Abstract

We introduce an algorithm for generating a random sequence of fragmentation trees, which we call the 
ancestral branching algorithm. This algorithm builds on the recursive partitioning structure of a tree and gives 
rise to an associated family of Markovian transition kernels whose finite-dimensional transition probabilities 
can be written in closed form as the product over partition-valued Markov kernels. The associated tree-valued
Markov process is infinitely exchangeable provided its associated partition-valued kernel is infinitely
exchangeable. We also identify a transition procedure on partitions, called the cut-and-paste algorithm, which
corresponds to a previously studied partition-valued Markov process on partitions with a bounded number 
of blocks. Specifically, we discuss the corresponding family of tree-valued Markov kernels generated by the 
combination of both the ancestral branching and cut-and-paste transition probabilities and show results for 
the equilibrium measure of this process, as well as its associated mass fragmentation-valued and weighted 
tree-valued processes. 

1 Some preliminaries

A set partition $B$ of the natural numbers $\mathbb{N}$ is a collection $\{B_1, B_2, \ldots\}$ of disjoint non-empty subsets of $\mathbb{N}$, called blocks, such that $\bigcup_i B_i = \mathbb{N}$. In general, we assume the blocks of $B$ are unordered, but whenever we wish to emphasize that blocks are listed in a particular order we write $B = (B_1, B_2, \ldots)$. Write $\mathcal{P}$ to denote the space of set partitions of $\mathbb{N}$.



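For $B \in \mathcal{P}$ and $b \in B$, $\#B$ is the number of blocks of $B$ and $\#b$ is the number of elements of $b$. We write $\mathcal{P}^{(k)}$ to denote the space of partitions of $\mathbb{N}$ with at most $k \geq 1$ blocks, i.e. $\mathcal{P}^{(k)} := \{B \in \mathcal{P} : \#B \leq k\}$. For a partition $B$ with blocks $\{B_1, B_2, \ldots\}$ and any $A \subset \mathbb{N}$, let $B_{|A}$ denote the restriction of $B$ to $A$, i.e. $B_{|A} := \{B_i \cap A : i \geq 1\}$ (excluding the empty set). We write $\mathcal{P}_A$ and $\mathcal{P}_A^{(k)}$ to denote the restriction to $A$ of $\mathcal{P}$ and $\mathcal{P}^{(k)}$ respectively. In particular, for $n \in \mathbb{N}$, $\mathcal{P}_{[n]}$ and $\mathcal{P}_{[n]}^{(k)}$ are the restriction to $[n] := \{1, \ldots, n\}$ of $\mathcal{P}$ and $\mathcal{P}^{(k)}$ respectively.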

For each $n \in \mathbb{N}$, we define the deletion operation $D_n : 2^{\mathbb{N}} \to 2^{\mathbb{N}}$ which acts on subsets of $\mathbb{N}$ by removing $\{n\}$ from $A$, i.e. $A \mapsto D_n A := A \setminus \{n\}$ for each $A \subset \mathbb{N}$. In general, for $A, B \subset \mathbb{N}$ non-empty, $D_B A := A \setminus B = A - B = A \cap B^c$. For each $n \geq 1$, we define the deletion operation on partitions $D_{n,n+1} : \mathcal{P}_{[n+1]} \to \mathcal{P}_{[n]}$ in terms of $D_{n+1}$ by $D_{n,n+1} B = B_{|[n]} := \{D_{n+1} b : b \in B\}$ for every $B \in \mathcal{P}_{[n+1]}$, and for $m < n$ define $D_{m,n} := D_{m,m+1} \circ \cdots \circ D_{n-1,n}$. The finite spaces $(\mathcal{P}_{[n]}, n \geq 1)$ together with all deletion $(D_{m,n}, m < n)$ and permutation maps, and their compositions, define a projective system of set partitions.

A sequence $(B_1, B_2, \ldots)$ such that $B_n \in \mathcal{P}_{[n]}$ for each $n \geq 1$ is said to be compatible if $B_n = D_{n,n+1} B_{n+1}$ for each $n \geq 1$. Any $B \in \mathcal{P}$ can be represented as the compatible sequence of its finite restrictions, $(B_{|[n]}, n \geq 1)$, and we often write $B := (B_{|[n]}, n \geq 1)$.
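The restriction and compatibility conditions above can be sketched in a few lines of Python. This is only an illustration for finite label sets: the representation of a partition as a frozenset of frozensets and the helper names `make_partition`, `restrict` and `is_compatible` are assumptions made here, not notation from the paper.

```python
from typing import FrozenSet, Iterable, List

Partition = FrozenSet[FrozenSet[int]]


def make_partition(blocks: Iterable[Iterable[int]]) -> Partition:
    """Store a partition as a frozenset of non-empty frozenset blocks."""
    frozen = (frozenset(block) for block in blocks)
    return frozenset(block for block in frozen if block)


def restrict(partition: Partition, subset: Iterable[int]) -> Partition:
    """Restriction B|A: intersect every block with A and drop empty sets."""
    subset = frozenset(subset)
    return frozenset(block & subset for block in partition if block & subset)


def is_compatible(sequence: List[Partition]) -> bool:
    """Check B_n = D_{n,n+1} B_{n+1}; sequence[n - 1] is a partition of [n]."""
    return all(
        restrict(sequence[n], range(1, n + 1)) == sequence[n - 1]
        for n in range(1, len(sequence))
    )


# A partition of [5] and the sequence of its restrictions to [1], ..., [5].
B = make_partition([[1, 3], [2, 4], [5]])
restrictions = [restrict(B, range(1, n + 1)) for n in range(1, 6)]
```

By construction, the sequence of restrictions of a single partition is compatible, which is the finite analogue of representing $B$ by $(B_{|[n]}, n \geq 1)$.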



1.1 Fragmentation trees 

For any subset $A \subset \mathbb{N}$, a collection of non-empty subsets $T \subset 2^A$, the power set of $A$, is an $A$-labeled rooted tree if

(i) $A \in T$, called the root of $T$ and denoted $\mathrm{root}(T) = A$, and
(ii) $A, B \in T$ implies $A \cap B \in \{\emptyset, A, B\}$. That is, either $A$ and $B$ are disjoint or one is a subset of the other.

If $T$ contains all singleton subsets of $A$, $T$ is called a fragmentation tree. Throughout the rest of this paper, the words tree and fragmentation are both understood to mean fragmentation tree. We write $\mathcal{T}_A$ to denote the space of fragmentations of $A$ and $\mathcal{T} = \mathcal{T}_{\mathbb{N}}$ to denote the space of fragmentations of $\mathbb{N}$.

As a collection of subsets of $A \subset \mathbb{N}$, the elements of $T \in \mathcal{T}_A$ are partially ordered by inclusion. That is, if $A, B \in T$ such that $A \subset B$, then the intervals $[A,B]$, $(A,B]$, and $[A,B)$ are well-defined subsets of $T$. This partial ordering induces a natural genealogical interpretation of the relationships among the elements of a tree. For each $t \in T$, the subset $\mathrm{anc}(t) := (t, A] := \{s \in T : t \subsetneq s\}$ denotes the set of ancestors of $t$. Note that $\mathrm{anc}(\mathrm{root}(T)) = \emptyset$ and for each $t \neq \mathrm{root}(T)$, $\mathrm{anc}(t)$ has a least element denoted by $\mathrm{pa}(t) := \min \mathrm{anc}(t)$, the parent of $t$.

Conversely, except for the singleton elements of $T$, each $t \in T$ is the parent of some collection of elements of $T$, called the children of $t$, which is given by $\mathrm{pa}^{-1}(t) := \mathrm{frag}(t) := \{t' \in T : \mathrm{pa}(t') = t\}$. For finite $A \subset \mathbb{N}$ and $T \in \mathcal{T}_A$, $\mathrm{frag}(t)$ forms a non-trivial partition of $t$ for each non-singleton $t \in T$. In particular, for each finite subset $A \subset \mathbb{N}$ and any tree $T \in \mathcal{T}_A$, the children of $\mathrm{root}(T)$ form a well-defined root partition, denoted $\Pi_T := \mathrm{rp}(T) := \mathrm{frag}(\mathrm{root}(T))$. The fragmentation degree of $T$ is given by $\max_{t \in T} \#\mathrm{frag}(t)$, which may be infinite. For $k \geq 1$, we write $\mathcal{T}_A^{(k)}$ to denote the collection of trees of $A$ with fragmentation degree at most $k$.

For any subset $S \subset A$, the restriction of $T \in \mathcal{T}_A$ to $S$ is defined by $T_{|S} := \{S \cap t : t \in T\}$ (excluding the empty set), the reduced sub-tree of Aldous [2]. Recall the deletion operation $D_S : 2^{\mathbb{N}} \to 2^{\mathbb{N}}$ defined above by restriction to the complement of $S$. For any tree $T \in \mathcal{T}_A$ and $S \subset A$, $D_S T := \{D_S t : t \in T\} = \{t \cap S^c : t \in T\} = T_{|A \cap S^c}$. We use the notation $D_{n,n+1} : \mathcal{T}_{n+1} \to \mathcal{T}_n$ to denote the operation $D_{n,n+1} T := T_{|[n]}$ on trees, where $\mathcal{T}_n$ denotes the space of $[n]$-labeled trees. Note that the apparent overloading of $D_{n,n+1}$ as a function on both $\mathcal{P}_{[n+1]}$ and $\mathcal{T}_{n+1}$ should cause no confusion as it is fundamentally defined, in both cases, as a function on collections of subsets of $\mathbb{N}$ through the set operation $D_{n+1}$.

As in the description of partitions of $\mathbb{N}$, any fragmentation $T \in \mathcal{T}$ can be expressed as a compatible sequence $(T_{|[n]}, n \geq 1)$ of reduced subtrees on the projective system of $[n]$-labeled trees $(\mathcal{T}_n, n \geq 1)$ together with deletion $(D_{m,n}, m < n)$ and permutation maps. For $T \in \mathcal{T}$, we often write $T := (T_{|[n]}, n \geq 1)$.
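The definitions of this subsection translate directly into set operations. The sketch below is a hypothetical finite-set illustration: a tree is stored as a frozenset of frozensets, and the function names (`is_fragmentation_tree`, `parent`, `frag`, `restrict_tree`) are chosen here to mirror root, pa, frag and the reduced subtree; none of them come from the paper.

```python
from typing import FrozenSet, Iterable, Optional

Tree = FrozenSet[FrozenSet[int]]


def is_fragmentation_tree(tree: Tree, labels: Iterable[int]) -> bool:
    """Check conditions (i) and (ii) plus the presence of all singletons."""
    root = frozenset(labels)
    if root not in tree:
        return False
    if any(not t or not t <= root for t in tree):
        return False
    if any(frozenset([a]) not in tree for a in root):
        return False
    # Any two elements are either disjoint or nested.
    return all(s & t in (frozenset(), s, t) for s in tree for t in tree)


def parent(tree: Tree, t: FrozenSet[int]) -> Optional[FrozenSet[int]]:
    """pa(t): the least strict ancestor of t, or None for the root."""
    ancestors = [s for s in tree if t < s]
    return min(ancestors, key=len) if ancestors else None


def frag(tree: Tree, t: FrozenSet[int]) -> Tree:
    """frag(t): the children of t, i.e. the elements whose parent is t."""
    return frozenset(c for c in tree if parent(tree, c) == t)


def restrict_tree(tree: Tree, subset: Iterable[int]) -> Tree:
    """Reduced subtree T|S: intersect each element with S, drop empty sets."""
    subset = frozenset(subset)
    return frozenset(t & subset for t in tree if t & subset)


# The balanced tree on {1, 2, 3, 4} with root partition {{1, 2}, {3, 4}}.
T = frozenset(frozenset(s) for s in [{1, 2, 3, 4}, {1, 2}, {3, 4}, {1}, {2}, {3}, {4}])
root_partition = frag(T, frozenset({1, 2, 3, 4}))
```

Because the ancestors of any element form a chain under inclusion, taking the ancestor of smallest cardinality in `parent` recovers the least element of anc(t).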



2 Summary of main results 

Our main result is the description of an explicit random algorithm for generating a sequence of fragmentation trees and conditions under which this algorithm characterizes an infinitely exchangeable Markov process on $\mathcal{T}$, which turns out to be quite general. Later, we discuss a special subclass of this family of tree-valued processes for which we can establish the Feller property and existence of associated processes on mass fragmentations and weighted trees. This subfamily can give rise to an infinitely exchangeable process on, for example, binary trees, which could have implications in certain areas of inference for unknown phylogenetic trees. The associated weighted tree-valued process may also be applicable to certain aspects of hidden Markov modeling in a phylogenetic setting.



Previously, random algorithms, e.g. subtree prune-regraft (SPR), genetic algorithms, neighbor-joining, etc., have been described in the context of Markov chain Monte Carlo (MCMC) and searching the space of trees in the context of inference of unknown phylogenetic trees, see e.g. Felsenstein [16] for an overview. In particular, Evans and Winter [14] study a tree-valued process based on an SPR algorithm which is reversible with respect to Aldous's continuum random tree (CRT) [2]. Earlier, Aldous and Pitman [3] studied a tree-valued process based on SPR and its connection to the Galton-Watson process.

Below we introduce a random algorithm, which we call the ancestral branching algorithm, which is of 
a different nature than those previously studied in this context and generates different sample paths on the 
space of fragmentation trees than its predecessors. This procedure admits an explicit expression for finite-dimensional
Markovian transition probabilities which is of an intuitive form, and can be related to the notion
of successive partitioning of a set which is common in the study of fragmentation processes. We subsequently 
show a construction of an infinitely exchangeable process which evolves according to ancestral branching, as 
well as connections to Poisson point processes, mass fragmentations and weighted trees. 



2.1 Ancestral branching kernels 

A Markov kernel on a set $\mathcal{X}$ is a collection $\{p(x, \cdot) : x \in \mathcal{X}\}$ of probability distributions on $\mathcal{X}$ indexed by the elements of $\mathcal{X}$. In particular, for any $A \subset \mathbb{N}$, a Markov kernel on $\mathcal{P}_A$ is a collection $P_A := \{p_A(B, \cdot) : B \in \mathcal{P}_A\}$ of probability distributions on $\mathcal{P}_A$ indexed by the elements of $\mathcal{P}_A$.

Let $A \subset \mathbb{N}$ be a finite subset such that $\#A \geq 2$ and let $\{P_S : S \subset A\}$ be a collection of Markov kernels on $\mathcal{P}_S$ for all $S \subset A$. Given $T \in \mathcal{T}_A$, a fragmentation of $A$, generate a new fragmentation $T' \in \mathcal{T}_A$ by the following procedure.



Ancestral Branching (AB) Algorithm 

(i) Put $F := \{A\}$.

(ii) Pick any $b$ from $F$ such that $\#b \geq 2$.

(iii) Generate $\pi_b$ from $p_b(\Pi_{T_{|b}}, \cdot)$, the transition measure on $\mathcal{P}_b$ with initial state given by the root partition of the reduced subtree $T_{|b}$, independently of everything generated previously.

(iv) If $\pi_b = \mathbf{1}_b = \{b\}$, discard and repeat step (iii) for $b$; otherwise, put $\Pi_{T'_{|b}} = \pi_b$, i.e. define the children of $b$ in $T'$, $\mathrm{frag}(b)$, by the blocks of $\pi_b$.

(v) Remove $b$ from $F$ and add the blocks of $\pi_b$ to $F$, i.e. $F \mapsto (F - \{b\}) \cup \mathrm{frag}(b)$.

(vi) If there is a non-singleton element of $F$, i.e. $\#\{b \in F : \#b \geq 2\} > 0$, go to (ii); otherwise, stop.

If we assume for each $b \subset A$ that $p_b(B, \mathbf{1}_b) < 1$ for each $B \in \mathcal{P}_b$, then $\mathrm{frag}(b)$ is almost surely generated in a finite number of steps in (iii) and (iv). Since $A$ is a finite set, the above algorithm then runs in a finite number of steps with probability one.
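Steps (i) to (vi) can be sketched as follows. This is a minimal illustration under stated assumptions rather than a reference implementation: trees are frozensets of frozensets, the partition-valued kernel is supplied as a hypothetical sampler `sample_kernel(b, start, rng)`, and `two_colour_kernel` is an invented example kernel (it ignores its initial state and satisfies $p_b(\cdot, \mathbf{1}_b) < 1$ whenever $\#b \geq 2$).

```python
import random


def root_partition_of_restriction(tree, b):
    """Root partition of the reduced subtree T|b (for #b >= 2)."""
    restricted = {t & b for t in tree if t & b}
    proper = [t for t in restricted if t < b]
    # The children of b are the maximal proper subsets of b in T|b.
    return frozenset(t for t in proper if not any(t < s for s in proper))


def ancestral_branching(tree, labels, sample_kernel, rng):
    """One AB transition T -> T' on fragmentation trees of a finite label set."""
    root = frozenset(labels)
    new_tree = {root}
    frontier = [root]                      # the set F of step (i)
    while frontier:                        # step (vi)
        b = frontier.pop()                 # step (ii); singletons are skipped
        if len(b) < 2:
            continue
        start = root_partition_of_restriction(tree, b)
        while True:                        # steps (iii) and (iv)
            pi_b = sample_kernel(b, start, rng)
            if pi_b != frozenset([b]):     # reject the one-block partition
                break
        new_tree.update(pi_b)              # blocks of pi_b become frag(b) in T'
        frontier.extend(pi_b)              # step (v)
    return frozenset(new_tree)


def two_colour_kernel(b, start, rng):
    """Hypothetical kernel: colour each element 0 or 1 independently."""
    colours = {x: rng.randrange(2) for x in sorted(b)}
    blocks = [frozenset(x for x in b if colours[x] == c) for c in (0, 1)]
    return frozenset(block for block in blocks if block)


rng = random.Random(0)
T = frozenset(frozenset(s) for s in [{1, 2, 3, 4}, {1, 2}, {3, 4}, {1}, {2}, {3}, {4}])
T_new = ancestral_branching(T, {1, 2, 3, 4}, two_colour_kernel, rng)
```

The rejection loop is what turns $p_b$ into the conditional law given a non-trivial partition; it terminates almost surely exactly when $p_b(\cdot, \mathbf{1}_b) < 1$.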

Henceforth, we shall assume the partition-valued kernels $\{p_b(\cdot,\cdot) : b \subset \mathbb{N}\}$ satisfy $p_b(\cdot, \mathbf{1}_b) < 1$ for every $b \subset \mathbb{N}$. Under this condition, it is straightforward to show that the above algorithm culminates in a transition probability $Q_A(T, \cdot)$ on $\mathcal{T}_A$, which we can express in closed form by

$$Q_A(T, T') = \prod_{b \in T' : \#b \geq 2} \frac{p_b(\Pi_{T_{|b}}, \Pi_{T'_{|b}})}{1 - p_b(\Pi_{T_{|b}}, \mathbf{1}_b)}, \tag{1}$$

the product of Markov kernels on the root partitions of the reduced subtrees at all parents of $T'$, conditioned to be non-trivial, i.e. not the one-block partition $\mathbf{1}_b$.

To see this, note that for each $b \in T'$ we generate $\Pi_{T'_{|b}}$ independently of all other random partitions generated by this algorithm. Therefore, we can write $Q_A$ as a product over $\{b \in T' : \#b \geq 2\}$ of conditional probabilities $\mathbb{P}_b(\Pi_{T'_{|b}} = \pi \mid T = t)$, i.e.

$$Q_A(t, t') = \prod_{b \in t' : \#b \geq 2} \mathbb{P}_b(\Pi_{T'_{|b}} = \Pi_{t'_{|b}} \mid T = t).$$

From (iii) and (iv), we have that

$$\mathbb{P}_b(\Pi_{T'_{|b}} = \pi_b \mid T = t) = \sum_{j \geq 0} p_b(\Pi_{t_{|b}}, \mathbf{1}_b)^j \, p_b(\Pi_{t_{|b}}, \pi_b) = \frac{p_b(\Pi_{t_{|b}}, \pi_b)}{1 - p_b(\Pi_{t_{|b}}, \mathbf{1}_b)}$$

for each $b \in t'$, which gives us (1). By a straightforward induction argument, one can easily show that the sum of (1) over the elements of $\mathcal{T}_A$ equals one, and so (1) defines a Markov kernel on $\mathcal{T}_A$.

We call any Markov kernel on $\mathcal{T}_A$ of the form (1) an ancestral branching (AB) Markov kernel on $\mathcal{T}_A$. It is clear that transitions $T \mapsto T'$ on $\mathcal{T}_A$ governed by an AB kernel $Q_A(\cdot,\cdot)$ can be generated according to the AB algorithm by taking the transition probabilities in step (iii) to be the $p_b(\cdot,\cdot)$ used in the product of (1).

For $A \subset \mathbb{N}$ with $2 \leq \#A < \infty$, the form of (1) admits the recursive expression

$$Q_A(T, T') = \frac{p_A(\Pi_T, \Pi_{T'})}{1 - p_A(\Pi_T, \mathbf{1}_A)} \prod_{b \in \Pi_{T'}} Q_b(T_{|b}, T'_{|b}), \tag{2}$$

which has an intuitive interpretation in terms of independent self-similar transitions on the space of reduced subtrees of the children of the root of $T$. The reader familiar with the literature on fragmentation processes may draw parallels to the usual description of a fragmentation process in terms of successive partitioning of fragments, see e.g. [8][21]. Indeed, the specification in (1) is related to this specification, but has the added feature of including a Markovian dependence on the previous state in a sequence of fragmentation trees, which has not previously appeared in the study of tree-valued processes.
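The recursive expression (2) can be checked numerically on small label sets. The following sketch enumerates all fragmentation trees of a finite set, evaluates $Q_A(T, T')$ by the recursion, and can be used to confirm that each row sums to one. The kernel `lazy_uniform_kernel` is a hypothetical example (stay put with probability 1/2, otherwise move to a uniformly chosen partition), and all function names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product


def set_partitions(items):
    """Yield every partition of the list `items` as a frozenset of frozensets."""
    if not items:
        yield frozenset()
        return
    first, rest = items[0], items[1:]
    for smaller in set_partitions(rest):
        yield smaller | {frozenset([first])}           # `first` alone
        for block in smaller:                          # or joined to a block
            yield (smaller - {block}) | {block | {first}}


def fragmentation_trees(b):
    """Yield every fragmentation tree of the finite set b."""
    b = frozenset(b)
    if len(b) == 1:
        yield frozenset([b])
        return
    for pi in set_partitions(sorted(b)):
        if len(pi) < 2:
            continue                                   # root partition is non-trivial
        choices = [list(fragmentation_trees(block)) for block in pi]
        for subtrees in product(*choices):
            yield frozenset([b]).union(*subtrees)


def root_partition(tree, b):
    """Root partition of the reduced subtree T|b (for #b >= 2)."""
    restricted = {t & b for t in tree if t & b}
    proper = [t for t in restricted if t < b]
    return frozenset(t for t in proper if not any(t < s for s in proper))


def ab_probability(tree, new_tree, b, kernel):
    """Q_b(T|b, T'|b) computed through the recursion (2); b must be in T'."""
    if len(b) < 2:
        return Fraction(1)
    start, target = root_partition(tree, b), root_partition(new_tree, b)
    prob = kernel(b, start, target) / (1 - kernel(b, start, frozenset([b])))
    for child in target:
        prob *= ab_probability(tree, new_tree, child, kernel)
    return prob


def lazy_uniform_kernel(b, start, target):
    """Hypothetical p_b(start, target): lazy step towards a uniform partition."""
    bell = sum(1 for _ in set_partitions(sorted(b)))
    stay = Fraction(1, 2) if target == start else Fraction(0)
    return stay + Fraction(1, 2) / bell


A = frozenset({1, 2, 3, 4})
trees = list(fragmentation_trees(A))
row_sums = [sum(ab_probability(T, T2, A, lazy_uniform_kernel) for T2 in trees) for T in trees]
```

Exact rational arithmetic is used so that the normalization claimed after (1) can be compared with one without rounding error.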

The ancestral branching algorithm in section 2.1 only requires associated $\mathcal{P}$-valued transition probabilities to be defined on $\mathcal{P}_S \setminus \{\mathbf{1}_S\}$ for each $S \subset A$. However, in our treatment we always assume that we have a family of transition probabilities which is well-defined on the full space $\mathcal{P}_S$ and satisfies $p_S(\cdot, \mathbf{1}_S) < 1$. This distinction becomes necessary when we consider infinitely exchangeable processes of AB type later on.

The rest of this paper is organized as follows. In section 3 we discuss general conditions under which the AB algorithm gives rise to an infinitely exchangeable tree-valued process. Section 4 introduces an algorithm on set partitions, the cut-and-paste (CP) algorithm, and draws parallels to an infinitely exchangeable partition-valued process in [11]. Section 5 shows some special properties of the associated tree-valued process based on the combination of both the AB and CP algorithms.



3 Infinitely exchangeable processes 

Infinitely exchangeable random partitions and partition-valued processes have been studied in some detail in the literature. Ewens [15] first introduced his sampling formula as a model in population genetics, which was later studied as a process on set partitions by Kingman [17] and several others. Coalescent processes [13][19][20], fragmentation processes [5][6][7][21], fragmentation-coalescence processes (DOG!, and other general processes [11] are partition-valued processes for which conditions for infinite exchangeability have been discovered. Given the form of the finite-dimensional transition probabilities in (1) and its apparent relationship to partition-valued processes, we study conditions under which this tree-valued process is infinitely exchangeable.



3.1 Exchangeable ancestral branching Markov kernels 

A collection of Markov kernels $Q := \{Q_A(\cdot,\cdot) : A \subset \mathbb{N}\}$ on $(\mathcal{T}_A, A \subset \mathbb{N})$ is finitely exchangeable if for each $n \geq 1$, $A, B \subset \mathbb{N}$ with $\#A = \#B = n$, and $t \in \mathcal{T}_A$,

$$Q_A(t, \cdot) = Q_B(\varphi^*(t), \varphi^*(\cdot)) \tag{3}$$

for every injection $\varphi : A \to B$, where $\varphi^* : \mathcal{T}_A \to \mathcal{T}_B$ is its associated injection $\mathcal{T}_A \to \mathcal{T}_B$. In other words, $Q_B = Q_A \varphi^{*-1}$, the distribution induced on $\mathcal{T}_B$ by $Q_A$ and the injection $\varphi$. In this case, there exists a map $\varphi_A : A \to [n]$ such that $Q_A(\cdot,\cdot) = Q_n(\varphi_A(\cdot), \varphi_A(\cdot)) =: Q_n \varphi_A(\cdot,\cdot)$, the exchangeable transition probability function for $n$.

We define the canonical injection $A \to [n]$ as follows. Suppose, without loss of generality, that $A = \{a_1, \ldots, a_n\}$ with $a_1 < a_2 < \cdots < a_n$. Then we define the canonical injection by $\varphi_A : A \to [n]$, $a_i \mapsto i$. For each $A \subset \mathbb{N}$ such that $\#A = n$, we have $Q_A(\cdot,\cdot) = Q_n \varphi_A(\cdot,\cdot)$. Therefore, for a finitely exchangeable family of Markovian transition probabilities, we need only specify a transition probability $Q_n(\cdot,\cdot)$ on $\mathcal{T}_n$ for each $n \geq 1$.
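As a small illustration of the canonical injection, the sketch below relabels an $A$-labeled tree by the ranks of its labels. The dictionary representation of the map and the helper names `canonical_injection` and `relabel_tree` are assumptions made for this example only.

```python
def canonical_injection(labels):
    """Map a_i -> i, where a_1 < a_2 < ... < a_n are the sorted labels of A."""
    return {a: i for i, a in enumerate(sorted(labels), start=1)}


def relabel_tree(tree, phi):
    """Associated map on trees: apply phi to every element of the tree."""
    return frozenset(frozenset(phi[x] for x in t) for t in tree)


A = {3, 7, 9}
phi_A = canonical_injection(A)
# A tree on A whose root partition is {{3, 9}, {7}}.
tree_A = frozenset(frozenset(s) for s in [{3, 7, 9}, {3, 9}, {3}, {7}, {9}])
tree_n = relabel_tree(tree_A, phi_A)
```

For a finitely exchangeable family, a transition probability on $A$-labeled trees can then be evaluated by relabeling both arguments and applying the kernel for $n = \#A$.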

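Theorem 3.1. Let $n \geq 1$ and for each $A \subset \mathbb{N}$ with $\#A = n$ let $Q_A(\cdot,\cdot)$ be a branching Markov kernel on $\mathcal{T}_A$ defined by the family $\{P^*_S : S \subset A\}$, where $P^*_S := \{p_S(B, \cdot) : B \in \mathcal{P}^*_S := \mathcal{P}_S \setminus \{\mathbf{1}_S\}\}$. Assume further that for every finite $A, B \subset \mathbb{N}$ with $\#A = \#B$ and injection $\psi^* : \mathcal{P}_A \to \mathcal{P}_B$, $p_A(\pi, \mathbf{1}_A) = p_B(\psi^*(\pi), \mathbf{1}_B)$. Then the family $\{Q_A : A \subset \mathbb{N}\}$ is finitely exchangeable if and only if the restricted collection $\{P^*_S : S \subset \mathbb{N}\}$ is finitely exchangeable.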

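Proof. Let $A \subset \mathbb{N}$ be a finite subset and $P := \{P_S : S \subset A\}$ be some family of Markov kernels on $\{\mathcal{P}_S : S \subseteq A\}$. From (1), the AB Markov kernel on $\mathcal{T}_A$ based on $P$ is

$$Q_A(T, T') = \prod_{b \in T' : \#b \geq 2} \frac{p_b(\Pi_{T_{|b}}, \Pi_{T'_{|b}})}{1 - p_b(\Pi_{T_{|b}}, \mathbf{1}_b)}.$$

For $A, B \subset \mathbb{N}$ with $\#A = \#B$ and injection map $\varphi : A \to B$ with associated injection $\varphi^* : \mathcal{T}_A \to \mathcal{T}_B$, we also write $\varphi^*$ to denote the associated injection $\mathcal{P}_A \to \mathcal{P}_B$, which should cause no confusion since it is clear from context to which we are referring.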

For $n = 2$, let $A, B \subset \mathbb{N}$ be such that $\#A = \#B = 2$, with injection map $\varphi : A \to B$ and associated injection $\psi^* : \mathcal{P}_A \to \mathcal{P}_B$. In this case, $\#\mathcal{P}_A = \#\mathcal{P}_B = 2$ so that we can write $A_1, A_2$ as the elements of $\mathcal{P}_A$ with $\#A_1 = 1$ and $\#A_2 = 2$. Likewise, we write $B_1$ and $B_2$ for the one- and two-block (respectively) elements of $\mathcal{P}_B$. Hence, $\psi^*(A_i) = B_i$ for $i = 1, 2$. It is assumed that $p_B(\psi^*(\pi), \mathbf{1}_B) = p_A(\pi, \mathbf{1}_A)$ for each $\pi \in \mathcal{P}_A$. Hence, $p_B(\psi^*(\pi), \mathbf{1}_B) = p_B(\psi^*(\pi), B_1) = p_A(\pi, A_1)$ and $1 - p_B(\psi^*(\pi), B_1) = p_B(\psi^*(\pi), B_2) = p_A(\pi, A_2) = 1 - p_A(\pi, A_1)$, and $p_B = p_A \psi^{*-1}$ for $\#A = \#B = 2$. So $\{p_A(\cdot,\cdot) : \#A = 2\}$ is exchangeable. Also, $\#\mathcal{T}_A = \#\mathcal{T}_B = 1$ implies for $t \in \mathcal{T}_A$, $Q_A(t,t) = Q_B(\psi^*(t), \psi^*(t)) = 1$ trivially. So we have that $\{Q_A(\cdot,\cdot) : \#A = 2\}$ is exchangeable.

Now, fix $n \geq 2$ and suppose that for any pair $A, B \subset \mathbb{N}$ with $\#A = \#B \leq n$ and any injective map $\varphi : A \to B$ we have that $Q_B = Q_A \varphi^{*-1}$ implies that $p_B = p_A \varphi^{*-1}$ on $\mathcal{P}^*_B$. Now, consider $A^*, B^* \subset \mathbb{N}$ with $\#A^* = \#B^* = n + 1$ and let $\psi : A^* \to B^*$ be the unique injective map $A^* \to B^*$ whose restriction to $A \to B$ corresponds to $\varphi$. Write $\psi^* : \mathcal{T}_{A^*} \to \mathcal{T}_{B^*}$ for its associated injection $\mathcal{T}_{A^*} \to \mathcal{T}_{B^*}$.

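Assume that $Q_{A^*} = Q_{B^*} \psi^*$ and let $t, t' \in \mathcal{T}_{A^*}$. We have the following.

$$
\begin{align}
Q_{A^*}(t, t') &= \frac{p_{A^*}(\Pi_t, \Pi_{t'})}{1 - p_{A^*}(\Pi_t, \mathbf{1}_{A^*})} \prod_{b \in \Pi_{t'} : \#b \geq 2} Q_b(t_{|b}, t'_{|b}) \tag{4} \\
&= \frac{p_{B^*}(\Pi_{\psi^*(t)}, \Pi_{\psi^*(t')})}{1 - p_{B^*}(\Pi_{\psi^*(t)}, \mathbf{1}_{B^*})} \prod_{b \in \Pi_{\psi^*(t')} : \#b \geq 2} Q_b(\psi^*(t)_{|b}, \psi^*(t')_{|b}) \tag{5} \\
&= \frac{p_{B^*}(\Pi_{\psi^*(t)}, \Pi_{\psi^*(t')})}{1 - p_{B^*}(\Pi_{\psi^*(t)}, \mathbf{1}_{B^*})} \prod_{b \in \Pi_{t'} : \#b \geq 2} Q_{\psi(b)}(\psi^*(t_{|b}), \psi^*(t'_{|b})) \tag{6} \\
&= \frac{p_{B^*}(\Pi_{\psi^*(t)}, \Pi_{\psi^*(t')})}{1 - p_{B^*}(\Pi_{\psi^*(t)}, \mathbf{1}_{B^*})} \prod_{b \in \Pi_{t'} : \#b \geq 2} Q_b(t_{|b}, t'_{|b}), \tag{7}
\end{align}
$$

which implies that

$$\frac{p_{A^*}(\pi, \pi')}{1 - p_{A^*}(\pi, \mathbf{1}_{A^*})} = \frac{p_{B^*}(\psi^*(\pi), \psi^*(\pi'))}{1 - p_{B^*}(\psi^*(\pi), \mathbf{1}_{B^*})}$$

for all one-to-one functions $\psi^* : \mathcal{P}_{A^*} \to \mathcal{P}_{B^*}$ and all $\pi, \pi' \in \mathcal{P}_{A^*} \setminus \{\mathbf{1}_{A^*}\} =: \mathcal{P}^*_{A^*}$, which establishes that

$$p_{A^*}(\pi, \pi') = p_{B^*}(\psi^*(\pi), \psi^*(\pi'))$$

by assumption that $p_{B^*}(\psi^*(\cdot), \mathbf{1}_{B^*}) = p_{A^*}(\cdot, \mathbf{1}_{A^*})$ for all $A^*, B^*$ such that $\#A^* = \#B^*$ and any injective mapping $\psi : A^* \to B^*$.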

This establishes finite exchangeability for $\{p_A(\cdot,\cdot) : \#A \leq n + 1\}$. Induction implies this holds for all $n \geq 1$ and hence also implies finite exchangeability of $\{p_A(\cdot,\cdot) : A \subset \mathbb{N}, \#A < \infty\}$.

The reverse implication is obvious. In fact, if $\{P^*_S : S \subset \mathbb{N}\}$ is finitely exchangeable, then $p_A(\pi, \mathbf{1}_A) = p_B(\psi^*(\pi), \mathbf{1}_B)$ for any $A, B$ with $\#A = \#B$ and any injection $\psi^* : \mathcal{P}_A \to \mathcal{P}_B$. So the additional assumption in the statement of the theorem is implicit. □

Theorem 3.1 establishes a correspondence between collections of exchangeable Markov kernels on $\mathcal{P}_{[n]}$ such that $p_n(B, \mathbf{1}_n) < 1$ for each $n \geq 1$ and exchangeable ancestral branching Markov kernels on $\mathcal{T}_n$. For all practical purposes, it is sufficient to have an exchangeable Markov process on $\mathcal{P}_{[n]}$. There are several known results for exchangeable processes on the projective system $(\mathcal{P}_{[n]}, n \geq 1)$, e.g. the exchangeable coagulation-fragmentation (EFC) process [4] and the CP($\nu$)-Markov process [11], which we shall call the cut-and-paste process in light of the exposition in section 4, and any properties of the induced $\mathcal{T}$-valued process associated with either of these are of interest. As we see in section 5, the cut-and-paste ancestral branching process lends itself to certain extensions.



3.2 Consistent ancestral branching kernels 

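Let $A \subset \mathbb{N}$. A family of Markov kernels $\{Q_S : S \subset A\}$ defined on the projective system $\{\mathcal{T}_S : S \subset A\}$ is consistent if for all $\emptyset \neq C \subset B \subset A$, $t \in \mathcal{T}_C$ and $t^* \in D^{-1}_{C,B}(t)$,

$$Q_B D^{-1}_{C,B}(t^*, \cdot) := Q_B(t^*, D^{-1}_{C,B}(\cdot)) = Q_C(t, \cdot). \tag{8}$$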

In other words, for any $C \subset B$ and injection $\varphi : C \to B$ with associated projection $\varphi^* : \mathcal{T}_B \to \mathcal{T}_C$, we have $Q_C = Q_B \varphi^{*-1}$.

Theorem 3.2. Let $Q := \{Q_S : S \subseteq A\}$ be a family of ancestral branching Markov kernels based on a collection $P := \{P_S : S \subset A\}$. The family $Q$ is consistent if each $p_S(\pi, \cdot)$ is consistent for all $\pi$ such that $\pi \neq \mathbf{1}_S$.

Moreover, if in addition, $p_{S^*}(\pi^*, e_{S^*}) + p_{S^*}(\pi^*, \mathbf{1}_{S^*}) = p_S(\pi, \mathbf{1}_S)$ for every $S \subset S^*$ with $\#S = \#S^* - 1$ and every $\pi \in \mathcal{P}_S$ and $\pi^* \in D^{-1}_{S,S^*}(\pi)$, then $Q$ consistent implies $p_S(\cdot,\cdot)$ is consistent for all $S \subset A$.

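Proof. For $S \subset A$ and $x \in A \cap S^c$, write $S^* := S \cup \{x\}$.

Suppose $Q$ is consistent and $p_{S^*}(\pi^*, e_{S^*}) + p_{S^*}(\pi^*, \mathbf{1}_{S^*}) = p_S(\pi, \mathbf{1}_S)$ for every $\pi \in \mathcal{P}_S$ and $\pi^* \in D^{-1}_{S,S^*}(\pi)$. Then we show that $P$ is consistent by induction. For $S \subset A$ such that $\#S = 2$, we have that $\mathcal{T}_S$ contains exactly one element, which we denote $t_S$. It is clear that $Q_S(t_S, t_S) = 1$ and for any $S^* \subset A$ we have that

$$\sum_{t'' \in D^{-1}_{S,S^*}(t_S)} Q_{S^*}(t^*, t'') = \sum_{t'' \in \mathcal{T}_{S^*}} Q_{S^*}(t^*, t'') = \sum_{\pi \in \mathcal{P}_{S^*} \setminus \{\mathbf{1}_{S^*}\}} \frac{p_{S^*}(\Pi_{t^*}, \pi)}{1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})} = 1$$

for any $t^* \in \mathcal{T}_{S^*}$ by the fact that $Q_{S^*}$ is a transition probability. By our assumption, with $t = t' = t_S$, we have

$$
\begin{align}
1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*}) &= \sum_{\pi \in D^{-1}_{S,S^*}(\Pi_{t'})} p_{S^*}(\Pi_{t^*}, \pi) + \left[ p_S(\Pi_t, \mathbf{1}_S) - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*}) \right], \\
p_S(\Pi_t, \Pi_{t'}) + p_S(\Pi_t, \mathbf{1}_S) &= \sum_{\pi \in D^{-1}_{S,S^*}(\Pi_{t'})} p_{S^*}(\Pi_{t^*}, \pi) + p_S(\Pi_t, \mathbf{1}_S),
\end{align}
$$

and $p_S$ is consistent with $p_{S^*}$ for $\#S = 2$ and $S^*$.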

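Now, for each $S \subset A$ with $\#S = m < \#A$, assume that $p_R(\cdot,\cdot)$ is consistent for all $R \subset S$, and let $S^* = S \cup \{x\}$ for some $x \in A \cap S^c$. Assume $t, t' \in \mathcal{T}_S$ and let $t^* \in D^{-1}_{S,S^*}(t)$. For a partition $\pi \in \mathcal{P}_S$ and $b \in \pi$, write $b^* \in \pi^* \in D^{-1}_{S,S^*}(\pi)$ to denote the block of $\pi$ to which $x$ is added to obtain $\pi^*$. We have

$$
\begin{align}
&\sum_{t'' \in D^{-1}_{S,S^*}(t')} Q_{S^*}(t^*, t'') \tag{9} \\
&= \sum_{\pi^* \in D^{-1}_{S,S^*}(\Pi_{t'})} \; \sum_{t'' \in D^{-1}_{S,S^*}(t') : \Pi_{t''} = \pi^*} \frac{p_{S^*}(\Pi_{t^*}, \pi^*)}{1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})} \prod_{b \in \pi^*} Q_b(t^*_{|b}, t''_{|b}) + \frac{p_{S^*}(\Pi_{t^*}, e_{S^*})}{1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})} Q_S(t, t') \tag{10} \\
&= \sum_{\pi^* \in D^{-1}_{S,S^*}(\Pi_{t'})} \frac{p_{S^*}(\Pi_{t^*}, \pi^*)}{1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})} \left[ \prod_{b \in \pi^* : b \neq b^*} Q_b(t_{|b}, t'_{|b}) \right] \sum_{t''_{|b^*} \in D^{-1}_{b^* \cap S, b^*}(t'_{|b^* \cap S})} Q_{b^*}(t^*_{|b^*}, t''_{|b^*}) + \frac{p_{S^*}(\Pi_{t^*}, e_{S^*})}{1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})} Q_S(t, t') \tag{11} \\
&= \sum_{\pi^* \in D^{-1}_{S,S^*}(\Pi_{t'})} \frac{p_{S^*}(\Pi_{t^*}, \pi^*)}{1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})} \prod_{b \in \Pi_{t'}} Q_b(t_{|b}, t'_{|b}) + \frac{p_{S^*}(\Pi_{t^*}, e_{S^*})}{1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})} Q_S(t, t') \tag{12} \\
&= Q_S(t, t') \left[ \frac{p_{S^*}(\Pi_{t^*}, e_{S^*})}{1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})} + \sum_{\pi^* \in D^{-1}_{S,S^*}(\Pi_{t'})} \frac{p_{S^*}(\Pi_{t^*}, \pi^*) \, (1 - p_S(\Pi_t, \mathbf{1}_S))}{(1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})) \, p_S(\Pi_t, \Pi_{t'})} \right]. \tag{13}
\end{align}
$$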



Here, (10) follows by noticing that the restrictions $t_{|b}$ and $t'_{|b}$ are unaffected unless $b = b^*$ and that the sum over $t'' \in D^{-1}_{S,S^*}(t')$ can be broken down into a sum over $\pi^* \in D^{-1}_{S,S^*}(\Pi_{t'})$ and a sum over trees in the inverse image of the reduced subtree $t'_{|b^*}$. Line (11) follows by bringing factors that do not depend on $b^*$ outside of the sum. Line (12) follows by the induction hypothesis that $Q_b$ is consistent for all $b \subseteq S$. And line (13) follows by the recursive expression (2).
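Consistency requires that $\sum_{t'' \in D^{-1}_{S,S^*}(t')} Q_{S^*}(t^*, t'') = Q_S(t, t')$ for all $t^* \in D^{-1}_{S,S^*}(t)$ and hence we must have

$$\frac{p_{S^*}(\Pi_{t^*}, e_{S^*})}{1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})} + \sum_{\pi^* \in D^{-1}_{S,S^*}(\Pi_{t'})} \frac{p_{S^*}(\Pi_{t^*}, \pi^*) \, (1 - p_S(\Pi_t, \mathbf{1}_S))}{(1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})) \, p_S(\Pi_t, \Pi_{t'})} = 1$$

above, which is equivalent to

$$p_{S^*}(\Pi_{t^*}, e_{S^*}) + \frac{1 - p_S(\Pi_t, \mathbf{1}_S)}{p_S(\Pi_t, \Pi_{t'})} \sum_{\pi^* \in D^{-1}_{S,S^*}(\Pi_{t'})} p_{S^*}(\Pi_{t^*}, \pi^*) = 1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*}).$$

Suppose that $\sum_{\pi^* \in D^{-1}_{S,S^*}(\Pi_{t'})} p_{S^*}(\Pi_{t^*}, \pi^*) \neq p_S(\Pi_t, \Pi_{t'})$, then

$$\frac{1 - p_S(\Pi_t, \mathbf{1}_S)}{p_S(\Pi_t, \Pi_{t'})} \sum_{\pi^* \in D^{-1}_{S,S^*}(\Pi_{t'})} p_{S^*}(\Pi_{t^*}, \pi^*) \neq 1 - p_S(\Pi_t, \mathbf{1}_S)$$

and

$$\frac{p_{S^*}(\Pi_{t^*}, e_{S^*})}{1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})} + \sum_{\pi^* \in D^{-1}_{S,S^*}(\Pi_{t'})} \frac{p_{S^*}(\Pi_{t^*}, \pi^*) \, (1 - p_S(\Pi_t, \mathbf{1}_S))}{(1 - p_{S^*}(\Pi_{t^*}, \mathbf{1}_{S^*})) \, p_S(\Pi_t, \Pi_{t'})} \neq 1$$

by the assumption that $p_R(\cdot,\cdot)$ is consistent for all $R \subset S$ and our additional assumption. Hence, we conclude that consistency of $Q_S$ and $Q_{S^*}$, along with our additional assumption, implies that $p_S(\pi, \cdot)$ and $p_{S^*}(\pi^*, \cdot)$ are consistent for all $\pi \in \mathcal{P}_S$ with $\#\pi > 1$ and $\pi^* \in D^{-1}_{S,S^*}(\pi)$.

Reversal of the above argument shows that consistency of $p_S(\pi, \cdot)$ for $\pi$ with $\#\pi > 1$ is enough for $Q_S$ to be consistent in (13). □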

A priori, it is not obvious that either of the implications in the above theorem must hold, and it is potentially useful to know that a consistent family of partition-valued transition kernels is sufficient to construct a consistent family of tree-valued processes by the AB algorithm, provided that $p(B, \mathbf{1}) < 1$ for every $B$.
Infinitely exchangeable kernels. A tree-valued process $(T_j, j \geq 1)$ on $\mathcal{T}$ is infinitely exchangeable if its finite-dimensional distributions are both finitely exchangeable for every $n \geq 1$ and consistent. More precisely, for each $n \geq 1$ let $F_n$ be a probability measure on $\mathcal{T}_n$ and let $F := (F_n, n \geq 1)$ be the family of finite-dimensional distributions on $(\mathcal{T}_n, n \geq 1)$. The collection of spaces $(\mathcal{T}_n, n \geq 1)$ forms a projective system, i.e. for every $m < n$ and injection map $\varphi_{m,n} : [m] \to [n]$, there is an associated projection $\varphi^*_{m,n} : \mathcal{T}_n \to \mathcal{T}_m$. The collection of finite-dimensional measures $F$ is infinitely exchangeable if for each $m < n$ and injection map $\varphi_{m,n}$

$$F_m = F_n \varphi^{*-1}_{m,n}.$$

That is, the measure induced on $\mathcal{T}_m$ by $\varphi^*_{m,n}$, $F_n \varphi^{*-1}_{m,n}$, corresponds to $F_m$.

A family of Markov kernels $\{p_n(\cdot,\cdot), n \geq 1\}$ is infinitely exchangeable if $p_m(t, \cdot) = p_n(t^*, \varphi^{*-1}_{m,n}(\cdot))$ for all $m < n$ and injection maps $\varphi_{m,n} : [m] \to [n]$ [10]. Putting together theorems 3.1 and 3.2 we arrive at a condition for the infinite exchangeability of $Q$ in terms of associated partition-valued Markov kernels. In particular, if $\{p_S : S \subset A\}$ are finitely exchangeable and consistent, and $p_S(\cdot, \mathbf{1}_S) < 1$ for every $S$, then $Q$ is infinitely exchangeable and there is a unique transition measure $Q^\infty$ on $\mathcal{T}$, the space of fragmentation trees of $\mathbb{N}$, such that for every $n \geq 1$ and $t, t' \in \mathcal{T}_n$,

$$Q_n(t, t') = Q^\infty(t^\infty, \{t'' \in \mathcal{T} : t''_{|[n]} = t'\})$$

for any $t^\infty \in \{t^* : t^*_{|[n]} = t\}$. The coalescent process does not satisfy this condition because it becomes absorbed in the one-block state almost surely, but other known processes do, e.g. exchangeable fragmentation-coalescence (EFC) processes [4] and $\varrho_\nu$-Markov processes [11]. We now turn our attention to the $\varrho_\nu$-Markov process.



4 Cut-and-Paste algorithm 

We now consider an algorithm for generating a random sequence of set partitions. A special realization of this algorithm has been presented in [11], which is called the $\varrho_\nu$-Markov process for its connection to the paintbox process of Kingman [18]. Here we outline a more general algorithm, which we call the cut-and-paste (CP) algorithm, and we shall henceforth refer to the aforementioned $\varrho_\nu$-process by the more descriptive title of cut-and-paste process with parameter $\nu$, or CP($\nu$) process.

For $A \subset \mathbb{N}$, let $\{P_b : b \subset A\}$ be a collection of probability measures on $\mathcal{P}_b$ for each $b \subset A$, and let $\mu$ denote a probability measure on an at most countable set of labels, which we without loss of generality take to be the set of natural numbers $\mathbb{N}$. Given a set partition $\pi := \{\pi_1, \ldots, \pi_k\} \in \mathcal{P}_A$, we generate $\pi' \in \mathcal{P}_A$ by the cut-and-paste algorithm as follows.



Cut-and-Paste (CP) Algorithm 

(i) Generate independent random partitions $C_1, \ldots, C_k$, where for each $i = 1, \ldots, k$, $C_i := \{C_{i,1}, \ldots, C_{i,k_i}\} \sim P_{\pi_i}$ is a random partition of block $\pi_i$ of $\pi$, and we list the blocks of $C_i$ in order of appearance.

(ii) Generate independent random permutations $\sigma_1, \ldots, \sigma_k$, where for each $i = 1, \ldots, k$, $\sigma_i$ is a uniform random permutation of $[\#C_i] = [k_i]$.

(iii) Independently for each $i = 1, \ldots, k$, generate $m_i := (m_{i,1}, \ldots, m_{i,k_i})$ by drawing without replacement from $\mu$ (a size-biased ordering of the atoms of $\mu$) and assigning label $m_{i,\sigma_i(j)}$ to block $C_{i,j}$ of $C_i$.

(iv) For each $l \in \mathbb{N}$, put $\pi'_l = \bigcup \{C_{i,j} : m_{i,\sigma_i(j)} = l\}$, the union of the blocks of $C_1, \ldots, C_k$ which are labeled $l$ in step (iii).

(v) Put $\pi' := \{\pi'_l : l \in \mathbb{N}\} \setminus \{\emptyset\}$, the non-empty collection of $\pi'_l$ from step (iv).
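The five steps can be sketched for a finite label measure as follows. This is an illustration under assumptions, not the construction of [11]: the cutting measures are supplied through a hypothetical sampler `cut(block, rng)`, $\mu$ is given as a dictionary of weights with finite support (which must contain at least as many labels as any $C_i$ has blocks), and `two_colour_cut` is an invented example.

```python
import random


def draw_without_replacement(weights, count, rng):
    """Size-biased sample of `count` distinct labels from the weights dict."""
    remaining = dict(weights)
    if count > len(remaining):
        raise ValueError("the support of mu is too small for this many blocks")
    drawn = []
    for _ in range(count):
        labels = list(remaining)
        pick = rng.choices(labels, weights=[remaining[l] for l in labels])[0]
        drawn.append(pick)
        del remaining[pick]
    return drawn


def cut_and_paste(partition, cut, mu_weights, rng):
    """One CP transition pi -> pi' for a partition of a finite set."""
    pasted = {}
    for block in partition:
        pieces = cut(block, rng)                          # step (i)
        sigma = list(range(len(pieces)))                  # step (ii)
        rng.shuffle(sigma)
        labels = draw_without_replacement(mu_weights, len(pieces), rng)  # (iii)
        for j, piece in enumerate(pieces):
            # step (iv): paste together pieces that carry the same label
            pasted.setdefault(labels[sigma[j]], set()).update(piece)
    return frozenset(frozenset(b) for b in pasted.values() if b)        # (v)


def two_colour_cut(block, rng):
    """Hypothetical cutting measure: colour each element 0 or 1 independently."""
    colours = {x: rng.randrange(2) for x in sorted(block)}
    pieces = [{x for x in block if colours[x] == c} for c in (0, 1)]
    return sorted((p for p in pieces if p), key=min)      # order of appearance


rng = random.Random(0)
pi = [frozenset({1, 2, 3}), frozenset({4, 5}), frozenset({6})]
mu = {1: 0.5, 2: 0.3, 3: 0.2}
pi_new = cut_and_paste(pi, two_colour_cut, mu, rng)
```

Because labels are drawn without replacement within each block of $\pi$, two pieces of the same block are never pasted back together in this step; merging only happens across different blocks of $\pi$.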

The name cut-and-paste is derived from steps (i) and (iv) of this algorithm which involve, respectively, cutting (partitioning) the blocks of $\pi$ independently according to some measure and then pasting (coagulating) blocks which are assigned the same label in step (iii). This procedure can be synthesized in the form of a $k \times \mathbb{N}$ matrix, a generalization of the matrix construction of the CP($\nu$) process in [11], for any $k = 1, 2, \ldots, \infty$, as follows.

Let $\pi, C_1, \ldots, C_k, \sigma_1, \ldots, \sigma_k, m_1, \ldots, m_k$ be as above. Write $m\sigma_i(j) := m_{i,\sigma_i(j)}$ and $(m\sigma_i)^{-1}(l) := \{j : m_{i,\sigma_i(j)} = l\}$. If $(m\sigma_i)^{-1}(l) = \emptyset$ then we write $C_{i,(m\sigma_i)^{-1}(l)} = \emptyset$ in what follows. Then put $\pi'$ equal to the non-empty column totals of the matrix

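$$
\begin{array}{c|ccccc}
 & C_{\cdot 1} & C_{\cdot 2} & \cdots & C_{\cdot j} & \cdots \\ \hline
\pi_1 & C_{1,(m\sigma_1)^{-1}(1)} & C_{1,(m\sigma_1)^{-1}(2)} & \cdots & C_{1,(m\sigma_1)^{-1}(j)} & \cdots \\
\pi_2 & C_{2,(m\sigma_2)^{-1}(1)} & C_{2,(m\sigma_2)^{-1}(2)} & \cdots & C_{2,(m\sigma_2)^{-1}(j)} & \cdots \\
\vdots & \vdots & \vdots & & \vdots & \\
\pi_k & C_{k,(m\sigma_k)^{-1}(1)} & C_{k,(m\sigma_k)^{-1}(2)} & \cdots & C_{k,(m\sigma_k)^{-1}(j)} & \cdots
\end{array}
$$

That is, $\pi' := \{\pi'_l : l = 1, 2, \ldots\} \setminus \{\emptyset\}$ where $\pi'_l = \bigcup_{i=1}^{k} C_{i,(m\sigma_i)^{-1}(l)}$ for each $l = 1, 2, \ldots$.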

The above procedure is quite flexible, and the full extent of processes which are generated in this way remains to be seen. A particular process which arises according to a special case of the CP algorithm is the CP($\nu$) process, where the measure $\mu$ is assumed to be the uniform distribution on $[k]$ for some $k \geq 1$. This generates a tree-valued process, the cut-and-paste ancestral branching process, with some special properties, which we now discuss.



5 Cut-and-paste ancestral branching processes 

Above, we have studied the general formulation of both the ancestral branching algorithm on $\mathcal{T}$ and cut-and-paste algorithm on $\mathcal{P}$ and have shown some general relationships between exchangeable and consistent Markov kernels on partitions and their corresponding AB kernels on $\mathcal{T}$. We now turn our attention to a particular family of partition-valued Markov processes which we previously studied in [11]. First, we discuss some preliminaries.

Let $\mathcal{P}_{\mathrm{m}} = \{(s_1, s_2, \ldots) : s_1 \geq s_2 \geq \cdots \geq 0, \sum_i s_i \leq 1\}$ be the space of ranked-mass partitions. For $s \in \mathcal{P}_{\mathrm{m}}$, let $X := (X_1, X_2, \ldots)$ be independent random variables with distribution

$$
\mathbb{P}_s(X_i = j) =
\begin{cases}
s_j, & j \geq 1, \\
1 - \sum_{k=1}^{\infty} s_k, & j = -i, \\
0, & \text{otherwise.}
\end{cases}
$$

The partition $\Pi(X)$ generated by $s$ through $X$ satisfies $i \sim_{\Pi(X)} j$ if and only if $X_i = X_j$. The distribution of $\Pi(X)$ is written $\varrho_s$ and $\Pi(X)$ is called the paintbox based on $s$. For a probability measure $\nu$ on $\mathcal{P}_{\mathrm{m}}$, the paintbox based on $\nu$ is the $\nu$-mixture of paintboxes, written $\varrho_\nu(\cdot) := \int_{\mathcal{P}_{\mathrm{m}}} \varrho_s(\cdot) \, \nu(ds)$. Any partition obtained in this way is an exchangeable random partition of $\mathbb{N}$ and every infinitely exchangeable partition admits a representation as the paintbox generated by some $\nu$. See [1] and [22] for more details on the paintbox process.
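A paintbox restricted to $[n]$ is straightforward to simulate from the definition of $X$. The sketch below assumes a mass partition with finitely many non-zero entries, passed as a list `s`; the function name `paintbox` and the inverse-CDF sampling of each $X_i$ are choices made for this illustration.

```python
import random


def paintbox(s, n, rng):
    """Sample the restriction to [n] of the paintbox based on the masses s."""
    colours = []
    for i in range(1, n + 1):
        u = rng.random()
        acc = 0.0
        colour = -i                     # dust: X_i = -i with prob. 1 - sum(s)
        for j, mass in enumerate(s, start=1):
            acc += mass
            if u < acc:
                colour = j              # X_i = j with probability s_j
                break
        colours.append(colour)
    blocks = {}
    for i, colour in enumerate(colours, start=1):
        blocks.setdefault(colour, set()).add(i)   # i ~ j iff X_i = X_j
    return frozenset(frozenset(b) for b in blocks.values())


rng = random.Random(0)
sample = paintbox([0.5, 0.3], 10, rng)  # masses sum to 0.8, so dust has mass 0.2
```

Elements that fall in the dust receive distinct negative values and therefore always form singleton blocks, which matches the case $j = -i$ in the distribution above.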



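For any probability measure $\nu$ on $\mathcal{P}_{\mathrm{m}}^{(k)} := \{s \in \mathcal{P}_{\mathrm{m}} : s_j = 0 \ \forall j > k, \ \sum_j s_j = 1\}$, the ranked $k$-simplex, let $\varrho_\nu(\cdot)$ be the paintbox based on $\nu$ as described above. For each $n \geq 1$, define finite-dimensional transition probabilities on $\mathcal{P}_{[n]}^{(k)}$ by

$$p_n(B, B'; \nu) := \frac{k!}{(k - \#B')!} \prod_{b \in B} \frac{(k - \#B'_{|b})!}{k!} \, \varrho_\nu(B'_{|b}). \tag{14}$$

The collection $(p_n(\cdot,\cdot;\nu), n \geq 1)$ of transition probabilities characterizes an infinitely exchangeable Markov process on $\mathcal{P}^{(k)}$, called the cut-and-paste process with parameter $\nu$, or CP($\nu$)-process, under the usual deletion operation $D_{n,n+1} : \mathcal{P}_{[n+1]} \to \mathcal{P}_{[n]}$, $B \mapsto D_{n,n+1}(B) := B_{|[n]}$ [11].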



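The transition mechanism on $\mathcal{P}^{(k)}$ characterized by the finite-dimensional transition probabilities in (14) admits the following useful construction. Let $B \in \mathcal{P}^{(k)}$, $C := (C_1, \ldots, C_k)$ be i.i.d. $\varrho_\nu$ paintboxes and $\sigma := (\sigma_1, \ldots, \sigma_k)$ be i.i.d. uniform random permutations of $[k]$. Construct the matrix

$$
\begin{array}{c|cccc}
 & C_{\cdot 1} & C_{\cdot 2} & \cdots & C_{\cdot k} \\ \hline
B_1 & C_{1,\sigma_1(1)} \cap B_1 & C_{1,\sigma_1(2)} \cap B_1 & \cdots & C_{1,\sigma_1(k)} \cap B_1 \\
B_2 & C_{2,\sigma_2(1)} \cap B_2 & C_{2,\sigma_2(2)} \cap B_2 & \cdots & C_{2,\sigma_2(k)} \cap B_2 \\
\vdots & \vdots & \vdots & & \vdots \\
B_k & C_{k,\sigma_k(1)} \cap B_k & C_{k,\sigma_k(2)} \cap B_k & \cdots & C_{k,\sigma_k(k)} \cap B_k
\end{array}
\;=:\; B \cap C^{\sigma}.
$$

We write $\mathrm{CP}(B, C, \sigma) := \{\bigcup_{i=1}^{k} (B_i \cap C_{i,\sigma_i(j)}), 1 \leq j \leq k\} \setminus \{\emptyset\}$ to be the partition whose blocks are given by the column totals of $B \cap C^{\sigma}$. This formulation corresponds to the finite-dimensional transitions in (14) and can be used in an alternate specification of the ancestral branching algorithm based on these transition probabilities.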
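The column-total construction can be written out directly for a finite ground set. In this sketch the inputs are assumed to be given deterministically: `B` is a list of at most $k$ blocks, `C[i]` is a list of exactly $k$ (possibly empty) sets forming a partition of the ground set, and `sigma[i]` is a permutation of `range(k)`, so indices are 0-based, unlike the 1-based indices of the text. The function name `cp` is an assumption.

```python
def cp(B, C, sigma, k):
    """Column totals of the matrix with entries B_i ∩ C_{i, sigma_i(j)}."""
    columns = [set() for _ in range(k)]
    for i, block in enumerate(B):
        for j in range(k):
            columns[j] |= block & C[i][sigma[i][j]]
    return frozenset(frozenset(col) for col in columns if col)


k = 2
B = [{1, 2, 3}, {4, 5}]
C = [
    [{1, 4}, {2, 3, 5}],      # C_1, a partition of {1, ..., 5} into k classes
    [{1, 2, 4}, {3, 5}],      # C_2
]
sigma = [[0, 1], [1, 0]]      # sigma_1 is the identity, sigma_2 swaps columns
result = cp(B, C, sigma, k)
```

In a simulation, each `C[i]` would be drawn as an independent paintbox and each `sigma[i]` as an independent uniform permutation; here they are fixed so that the column totals can be followed by hand.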

For $n \geq 1$, $k \geq 2$ and $\nu$ a probability measure on $\mathcal{P}_{\mathrm{m}}^{(k)}$, let $p_n(\cdot,\cdot;\nu)$ denote the CP($\nu$) transition probability on $\mathcal{P}_{[n]}^{(k)}$ in (14) and $q_n(\cdot,\cdot;\nu) = 1 - p_n(\cdot,\cdot;\nu)$ its complementary probability. The family $\{p_n(\cdot,\cdot;\nu) : n \geq 1\}$ is infinitely exchangeable and so defines a unique transition probability $p_A(\cdot,\cdot;\nu)$ on $\mathcal{P}_A^{(k)}$ for each $A \subset \mathbb{N}$ by

$$p_A(\cdot,\cdot;\nu) := p_{\#A}(\cdot,\cdot;\nu)$$

for $\#A < \infty$ and $p_A(\cdot,\cdot;\nu) = p_\infty(\cdot,\cdot;\nu)$ otherwise.

Furthermore, for $\nu$ non-degenerate at $(1, 0, \ldots, 0)$ we have that $p_b(\cdot, \mathbf{1}_b; \nu) < 1$ for all $b \subset \mathbb{N}$ with $\#b > 1$, and so (1) is well-defined and the results of section 3 hold. In particular, the $\mathcal{T}$-valued process induced by the finite-dimensional transition probabilities (14) and the ancestral branching algorithm is infinitely exchangeable.

For the AB algorithm based on the transition probabilities of the CP(v) process, we can describe an alter- 
native, though equivalent, formulation which is helpful in later sections. 



5.1 Alternative construction of the cut-and-paste ancestral branching Markov chain 

We introduce a genealogical indexing system to label the elements of tA £ S'a (chapter 1.2.1 of Bertoin (U) as 
follows. 

We write 

oo 

<% := (J N" 

n=0 

to denote the infinite set of all indices, with convention that N° = {0}. 

11 



For a fragmentation tree $T$, the $n$th generation of $T$ is the collection of children $t \in T$ such that $\#\mathrm{anc}(t) = n-1$. For each $u = (u_1,\ldots,u_n) = u_1u_2\cdots u_n \in \mathcal{U}$, $n$ is the generation of $u$. Write $u- := (u_1,\ldots,u_{n-1})$ to denote the parent of $u$ and $ui := (u,i) := (u_1,\ldots,u_n,i)$ for the $i$th child of $u$. As we are working in the context of fragmentations of subsets of $\mathbb{N}$, the $i$th child of $t \in T$ is the $i$th child to appear in a list when the elements of $\mathrm{frag}(t)$, the children of $t$, are listed in order of their least element.
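The indexing convention can be made concrete as follows. This is an illustrative sketch only; the helper names `index_children`, `child_label` and `parent_label` are hypothetical, and labels in $\mathcal{U}$ are represented as Python tuples with the empty tuple standing for the root.

```python
def index_children(blocks):
    """Map i -> i-th child, with children listed in order of least element (1-based)."""
    ordered = sorted((frozenset(b) for b in blocks), key=min)
    return {i: b for i, b in enumerate(ordered, start=1)}


def child_label(u, i):
    """Label ui = (u_1, ..., u_n, i) of the i-th child of the vertex labelled u."""
    return u + (i,)


def parent_label(u):
    """Label u- = (u_1, ..., u_{n-1}) of the parent of u."""
    if not u:
        raise ValueError("the root has no parent")
    return u[:-1]


children = index_children([{3, 5}, {1, 4}, {2}])
# children[1] == {1, 4}, children[2] == {2}, children[3] == {3, 5}
```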

A Markov chain on $\mathcal{T}^{(k)}$ which is governed by the same transition law as in the previous section can be constructed by a genealogical branching procedure as follows.

Let $k \geq 2$ and $\nu$ be a probability measure on $\mathcal{P}_m^{(k)}$ which is non-degenerate at $(1,0,\ldots,0)$. For $T, T' \in \mathcal{T}^{(k)}$, the transition $T \mapsto T'$ occurs as follows. Generate $\{B^u : u \in \mathcal{U}\}$ i.i.d. $\varrho_\nu^{(k)}$ partition sequences, where $\varrho_\nu^{(k)} := \varrho_\nu \otimes \cdots \otimes \varrho_\nu$ is the product measure of paintboxes based on $\nu$, and $\{\sigma^u : u \in \mathcal{U}\}$ i.i.d. $k$-tuples of i.i.d. uniform permutations of $[k]$.

Genealogical Branching Procedure

(i) Put $\Pi_{T'} = \mathrm{CP}(\Pi_T, B^{\emptyset}, \sigma^{\emptyset})$, the partition obtained from the column totals of $\Pi_T \cap (B^{\emptyset})^{\sigma^{\emptyset}}$, as shown in section 5;

(ii) for $A^u \in T'$, put $A^{uj}$ equal to the $j$th block of $\mathrm{CP}(\Pi_{T_{|A^u}}, B^u, \sigma^u)$ listed in order of least elements.

In other words, each $B^u$ is an independent $k$-tuple of independent paintboxes based on $\nu$ and we index this sequence just as we index the vertices of a tree. Likewise, each $\sigma^u$ is an independent $k$-tuple $(\sigma^u_1,\ldots,\sigma^u_k)$ of i.i.d. uniform permutations of $[k]$. The next state $T'$ is obtained from $T$ by a sequential branching procedure which starts from the root and progressively branches the roots of the subtrees restricted to each child of $T'$. The children of $T'$ are given by $\{A^u, u \in \mathcal{U}\}$ and for each $n \geq 1$ the restriction to $[n]$ of $T'$ is $T'_{|[n]} = \{A^u \cap [n], u \in \mathcal{U}\}$.

The genealogical branching procedure simultaneously generates sequences of trees on $\mathcal{T}_n^{(k)}$ for every $n \geq 1$. It should be plain that this construction is equivalent to that in section 5 since it uses the matrix construction of the CP($\nu$) transition probabilities on $\mathcal{P}_A^{(k)}$. The benefit of this construction is that it gives an explicit recipe which will be employed in the proofs of various properties of this process in later sections. For completeness, we provide a proof that the finite-dimensional transition probabilities of this process coincide with (13).

Proposition 5.1. Let $T \mapsto T' \in \mathcal{T}^{(k)}$ be a transition generated by the above genealogical branching procedure. For $n \geq 1$, the finite-dimensional transition probability of the restricted transition $T_{|[n]} \mapsto T'_{|[n]}$ is

$$Q_n(T,T';\nu) := \prod_{b \in T'} \frac{p_b(\Pi_{T_{|b}}, \Pi_{T'_{|b}};\nu)}{q_b(\Pi_{T_{|b}}, \mathbf{1}_b;\nu)}. \tag{15}$$

Proof. Write $p_n(\cdot,\cdot) = p_n(\cdot,\cdot;\nu)$ and $q_n(\cdot,\cdot) = q_n(\cdot,\cdot;\nu)$. For $n \geq 1$, the branching of the root of $T'_{|[n]}$ given $T_{|[n]}$ is given by $A^{u(m)}_{|[n]}$ for $u(m) \in \mathcal{U}$ such that $u(m) = (\underbrace{1,\ldots,1}_{m \text{ times}},0,\ldots)$ and $m$ is the smallest $m \geq 1$ such that $A^{u(m)}_{|[n]} \neq \{[n],\emptyset\}$, i.e. the first non-trivial partition of $[n]$ obtained by the above procedure. The distribution of the branching of the root of $T'_{|[n]}$ given $T_{|[n]}$ obtained in this way is

$$\sum_{i=0}^{\infty} p_n(\Pi_{T_{|[n]}}, \Pi_{T'_{|[n]}})\, p_n(\Pi_{T_{|[n]}}, \mathbf{1}_n)^i = \frac{p_n(\Pi_{T_{|[n]}}, \Pi_{T'_{|[n]}})}{q_n(\Pi_{T_{|[n]}}, \mathbf{1}_n)}.$$
By independence of the steps of the procedure, we can write the distribution of the transition $T \mapsto T'$ recursively as

$$Q_n(T,T';\nu) = \frac{p_n(\Pi_{T_{|[n]}}, \Pi_{T'_{|[n]}})}{q_n(\Pi_{T_{|[n]}}, \mathbf{1}_n)} \prod_{b \in \Pi_{T'_{|[n]}}} Q_b(T_{|b}, T'_{|b};\nu).$$

Iterating the above argument yields (15). □
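The geometric-series step in the proof says that repeating independent draws until the first non-trivial outcome is the same as conditioning a single draw on being non-trivial. The sketch below checks this numerically for a toy distribution; the function names and the dictionary representation of a transition law are assumptions of the example, not notation from the paper.

```python
def first_nontrivial_law(p, trivial):
    """Law of the first non-trivial outcome under repeated i.i.d. draws from p.

    p       : dict mapping outcomes to probabilities (one row of a transition law)
    trivial : the outcome playing the role of the one-block partition 1_n
    """
    q = 1.0 - p[trivial]  # complementary probability q_n(., 1_n)
    if q <= 0.0:
        raise ValueError("the trivial outcome must have probability < 1")
    return {b: prob / q for b, prob in p.items() if b != trivial}


def truncated_series(p, trivial, target, terms=200):
    """Partial sum of sum_i p(trivial)^i * p(target), the series in the proof."""
    return sum(p[trivial] ** i * p[target] for i in range(terms))


p = {"trivial": 0.5, "a": 0.3, "b": 0.2}
law = first_nontrivial_law(p, "trivial")  # {"a": 0.6, "b": 0.4} up to rounding
```

The non-degeneracy condition on $\nu$ is what guarantees the denominator is positive, which is why the sketch raises an error when the trivial outcome has probability one.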



5.2 Equilibrium measure 

The form of $Q_n(T,T';\nu)$ in (15) is a product of independent transition probabilities of the branching at the root in each of the subtrees of $T'$. It is known that for $\nu$ non-degenerate at $(1,0,\ldots,0) \in \mathcal{P}_m^{(k)}$, $p_n(\cdot,\cdot;\nu)$ has a unique equilibrium distribution for each $n \geq 1$ [11]. Since $p_n(B,B';\nu) > 0$ for every $n \geq 1$ and $B,B' \in \mathcal{P}_{[n]}^{(k)}$, we have that $Q_n(t,t';\nu) > 0$ for all $t,t' \in \mathcal{T}_n^{(k)}$ and so each $Q_n(\cdot,\cdot;\nu)$ is aperiodic and irreducible for non-degenerate $\nu$ on $\mathcal{P}_m^{(k)}$. The following proposition is immediate.

Proposition 5.2. Let $\nu$ be a probability measure on $\mathcal{P}_m^{(k)}$ such that $\nu((1,0,\ldots,0)) < 1$ and let $Q_n(\cdot,\cdot;\nu)$ be the CP($\nu$)-ancestral branching Markov kernel. Then there exists a unique measure $\rho_n(\cdot;\nu)$ on $\mathcal{T}_n^{(k)}$ which is stationary for $Q_n(\cdot,\cdot;\nu)$ for each $n \geq 1$.

It is easy to see that the above proposition can be generalized to general Markov chains by replacing the above condition $\nu \neq (1,0,\ldots,0)$ with the requirement that $p_n(B,B) > 0$ for every $n \geq 1$ and $B \in \mathcal{P}_{[n]}$ and that $p_n(\cdot,\cdot)$ is irreducible for every $n \geq 1$.
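For a finite state space, the unique stationary measure of a strictly positive transition matrix can be approximated by power iteration. The sketch below uses a toy two-state matrix as a stand-in, since the stationary measure of $Q_n(\cdot,\cdot;\nu)$ itself has no known closed form; the function name and tolerances are assumptions of the example.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Approximate the stationary row vector pi = pi P of a stochastic matrix P.

    Power iteration converges when P is irreducible and aperiodic,
    e.g. when all entries of P are strictly positive.
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    raise RuntimeError("power iteration did not converge")


P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)  # approximately (5/6, 1/6)
```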

The existence of $\rho_n(\cdot;\nu)$ and the finite exchangeability and consistency of $Q_n(\cdot,\cdot;\nu)$ for each $n \geq 1$ induce finite exchangeability and consistency for the collection $(\rho_n(\cdot;\nu), n \geq 1)$ of equilibrium measures.

Proposition 5.3. Let $(Q_n(\cdot,\cdot), n \geq 1)$ be an infinitely exchangeable collection of ancestral branching Markov kernels (CQ) on $(\mathcal{T}_n, n \geq 1)$ and suppose for each $n \geq 1$ that $\rho_n(\cdot)$ is a unique stationary distribution for $Q_n(\cdot,\cdot)$. Then the family $(\rho_n(\cdot), n \geq 1)$ is infinitely exchangeable.

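Proof. For $T'' \in \mathcal{T}_{n+1}$,

$$\rho_{n+1}(T'') = \sum_{T^* \in \mathcal{T}_{n+1}} \rho_{n+1}(T^*)\, Q_{n+1}(T^*,T'')$$

by stationarity.

Let $T' \in \mathcal{T}_m$ and for $1 \leq m \leq n$ let $\varphi : [m] \to [n]$ be an injection with associated projection $\varphi^* : \mathcal{T}_n \to \mathcal{T}_m$. Then

\begin{align}
(\rho_n \varphi^{*-1})(T') = \sum_{T'' \in \varphi^{*-1}(T')} \rho_n(T'') &= \sum_{T'' \in \varphi^{*-1}(T')} \sum_{T^* \in \mathcal{T}_n} \rho_n(T^*)\, Q_n(T^*,T'') \tag{16} \\
&= \sum_{T \in \mathcal{T}_m} \sum_{T^* \in \varphi^{*-1}(T)} \rho_n(T^*) \underbrace{\Bigg[ \sum_{T'' \in \varphi^{*-1}(T')} Q_n(T^*,T'') \Bigg]}_{Q_n \varphi^{*-1}(T,T') = Q_m(T,T')} \tag{17} \\
&= \sum_{T \in \mathcal{T}_m} Q_m(T,T') \underbrace{\sum_{T^* \in \varphi^{*-1}(T)} \rho_n(T^*)}_{\rho_n \varphi^{*-1}(T)} \tag{18} \\
&= \sum_{T \in \mathcal{T}_m} (\rho_n \varphi^{*-1})(T)\, Q_m(T,T'). \tag{19}
\end{align}

The expression in (17) follows from (16) by changing the order of summation and noting that each $T^* \in \mathcal{T}_n$ corresponds to exactly one $T \in \mathcal{T}_m$ through the mapping $\varphi^*$; (18) follows from (17) by the consistency of $Q_n(\cdot,\cdot)$ for each $n \geq 1$; and (19) follows from (18) by the definition of induced measures. Hence, the induced measure $\rho_n \varphi^{*-1}$ is stationary for $Q_m$. By uniqueness, $\rho_n \varphi^{*-1} = \rho_m$ for every injective mapping $\varphi : [m] \to [n]$. Hence, $(\rho_n, n \geq 1)$ is an infinitely exchangeable family of measures on $(\mathcal{T}_n, n \geq 1)$. □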

The existence of an infinitely exchangeable equilibrium measure $\rho(\cdot)$ on $\mathbb{N}$-labeled trees, $\mathcal{T}$, is a direct consequence of the finite exchangeability and consistency of the system $(\rho_n(\cdot), n \geq 1)$ shown in proposition 5.3 and Kolmogorov's extension theorem [9]. In this case, the measure $\rho(\cdot)$ satisfies

$$\rho_n(T_n) = \rho(\{T \in \mathcal{T} : T_{|[n]} = T_n\})$$

for every $n \geq 1$.

The above results for the equilibrium measure $\rho(\cdot)$ apply specifically to the CP($\nu$) ancestral branching process under the condition that $\nu$ is non-degenerate at $(1,0,\ldots,0) \in \mathcal{P}_m^{(k)}$.

Corollary 5.4. For $\nu$ non-degenerate at $(1,0,\ldots,0) \in \mathcal{P}_m^{(k)}$, the collection of stationary measures $(\rho_n(\cdot;\nu), n \geq 1)$ in proposition 5.2 is infinitely exchangeable.

Although the existence of a unique stationary measure on $\mathcal{T}^{(k)}$ is implicit in the construction of the transition at the beginning of this section, the form of the finite-dimensional and infinite-dimensional stationary measure remains unknown. Note that, though the transition probabilities (CQ) are conditionally of fragmentation type, i.e. given $T$ and $b \in T'$ the children of $b$ are distributed independently of the rest of $T'$, the equilibrium measure need not be of this form. Furthermore, it is of interest whether or not some subclass of the CP($\nu$) ancestral branching Markov chains is reversible and, if so, under what conditions this property holds.



5.2.1 Continuous-time ancestral branching process 

An infinitely exchangeable collection $(Q_n, n \geq 1)$ of ancestral branching transition probabilities can be embedded in continuous time in a straightforward way by defining the Markovian infinitesimal jump rates $r_n(\cdot,\cdot)$ on $\mathcal{T}_n$ by

$$r_n(T,T') = \begin{cases} \lambda\, Q_n(T,T'), & T \neq T', \\ 0, & \text{otherwise,} \end{cases} \tag{20}$$

for some $\lambda > 0$.

Definition 5.5. A process $T := (T(t), t \geq 0)$ is an ancestral branching Markov process if for each $n \geq 1$, the restriction $T_{|[n]} := (T_{|[n]}(t), t \geq 0)$ is a Markov process on $\mathcal{T}_n$ with infinitesimal transition rates $r_n(\cdot,\cdot)$.

A process on $\mathcal{T}$ whose finite-dimensional restrictions are governed by $r_n$ can be constructed by running a Markov chain on $\mathcal{T}_n$ governed by (15) in which only transitions $T \mapsto T'$ for $T \neq T'$ are permitted, and adding a hold time which is exponentially distributed with mean $1/[\lambda - \lambda Q_n(T,T)]$. The following is a corollary of theorems 3.1 and 3.2.

Corollary 5.6. For a measure $\nu$ on $\mathcal{P}_m^{(k)}$, the collection $(R_n^{\nu}, n \geq 1)$ of finite-dimensional Q-matrices based on (20) is consistent.

The existence of a continuous-time process with embedded jump chain governed by (13) is now clear by corollary 5.6 and the discussion at the end of section 3.

Theorem 5.7. There exists a continuous-time Markov process $(T(t), t \geq 0)$ on $\mathcal{T}^{(k)}$ governed by $Q^{\nu}$ such that

$$Q_n(T,T') = Q^{\nu}(T^{\infty}, \{T'' \in \mathcal{T}^{(k)} : T''_{|[n]} = T'\}),$$

for each $T^{\infty} \in \{T^* \in \mathcal{T}^{(k)} : T^*_{|[n]} = T\}$.

Proof. Corollary 5.6 establishes that the finite-dimensional infinitesimal jump rates $(r_n, n \geq 1)$ are finitely exchangeable and consistent. Kolmogorov's extension theorem implies the existence of $R^{\nu}$ with finite-dimensional restrictions given by $(r_n, n \geq 1)$. Furthermore, for each $n \geq 1$ and $T \in \mathcal{T}_n^{(k)}$, the total jump rate out of $T$ is $\lambda(1 - Q_n(T,T)) \leq \lambda < \infty$, so that the finite-dimensional paths are càdlàg for each $n$, which implies the paths of $(T(t), t \geq 0)$ governed by $R^{\nu}$ are càdlàg. □
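The continuous-time embedding of this section can be simulated on any finite state space: hold in the current state for an exponential time with rate $\lambda(1 - Q(T,T))$, then jump according to $Q(T,\cdot)$ conditioned on leaving $T$. The following is a minimal sketch under these assumptions; the toy kernel `Q`, the state names and the function name are hypothetical and stand in for a finite-dimensional kernel $Q_n$.

```python
import random


def simulate_embedded_chain(Q, start, lam, horizon, rng):
    """Simulate the jump process with rates lam * Q[s][s'] for s != s'.

    Q       : dict state -> dict state -> transition probability
    lam     : the rate parameter lambda > 0
    horizon : simulation stops at the first jump time exceeding this value
    Returns the list of (jump_time, new_state) pairs, starting with (0.0, start).
    """
    t, state = 0.0, start
    path = [(t, state)]
    while True:
        stay = Q[state].get(state, 0.0)
        exit_rate = lam * (1.0 - stay)
        if exit_rate <= 0.0:
            break  # absorbing state: no further jumps
        t += rng.expovariate(exit_rate)  # hold time with mean 1 / exit_rate
        if t > horizon:
            break
        # Jump chain: transitions to the current state are not permitted.
        others = [s for s in Q[state] if s != state]
        weights = [Q[state][s] for s in others]
        state = rng.choices(others, weights=weights, k=1)[0]
        path.append((t, state))
    return path


Q = {"a": {"a": 0.5, "b": 0.5}, "b": {"a": 1.0}}
path = simulate_embedded_chain(Q, "a", lam=2.0, horizon=10.0, rng=random.Random(0))
```

Passing the random generator explicitly keeps the simulation reproducible, which is convenient when coupling two processes through the same source of randomness.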

The transition rates above are defined in terms of a collection of infinitely exchangeable transition probabilities $(Q_n(\cdot,\cdot), n \geq 1)$. If $Q_n$ has unique equilibrium measure $\rho(\cdot)$, then so does its associated continuous-time process. We have the following corollary for the stationary measure of the continuous-time process.

Corollary 5.8. Let $(T(t), t \geq 0)$ be a continuous-time process governed by an infinitely exchangeable collection $(Q_n, n \geq 1)$ of ancestral branching transition probabilities £[]). Further suppose that for each $n \geq 1$, $Q_n$ has unique equilibrium measure $\rho_n$ and the characteristic measure $Q$ on $\mathcal{T}$ has unique equilibrium measure $\rho$ as in proposition 5.3. Then $(T(t), t \geq 0)$ has unique equilibrium measure $\rho$.

For our purposes, we now restrict our attention to the CP($\nu$) subfamily of ancestral branching processes on $\mathcal{T}^{(k)}$ with $\nu$ some measure on $\mathcal{P}_m^{(k)}$ for some $k > 1$. We index transition measures and stationary measures by $\nu$ to make this explicit. As we show, the CP($\nu$) associated ancestral branching process is a Feller process and has an associated mass fragmentation process.



5.3 Poissonian construction 

A consequence of the above continuous-time embedding and the alternative specification of the cut-and-paste ancestral branching algorithm given in section 5.1 is yet another alternative construction via a Poisson point process.

Let $P = \{(t, B^u : u \in \mathcal{U})\} \subset \mathbb{R}_+ \times \prod_{u \in \mathcal{U}} \big[ \prod_{j=1}^{k} \mathcal{P}^{(k)} \big]$ be a Poisson point process with intensity measure $dt \otimes \lambda \bigotimes_{u \in \mathcal{U}} \varrho_\nu^{(k)}$, where $\varrho_\nu^{(k)}$ is the product measure $\varrho_\nu \otimes \cdots \otimes \varrho_\nu$ on $\prod_{j=1}^{k} \mathcal{P}^{(k)}$. So for each $(t, B^u) \in P$, $B^u := (B^u_1,\ldots,B^u_k) \in \prod_{j=1}^{k} \mathcal{P}^{(k)}$ is distributed as $\varrho_\nu^{(k)}$ and is labeled according to the genealogical index system of section 5.1.

Construct a continuous-time CP($\nu$)-ancestral branching Markov process as follows. Let $T_0 \in \mathcal{T}^{(k)}$ be an infinitely exchangeable random fragmentation tree. For each $n \geq 1$, put $T_{|[n]}(0) = T_{0|[n]}$ and for $t > 0$

• if $t$ is not an atom time for $P$, then $T_{|[n]}(t) = T_{|[n]}(t-)$;

• if $t$ is an atom time for $P$ so that $(t, B^u : u \in \mathcal{U}) \in P$, generate $\sigma := (\sigma^u : u \in \mathcal{U}) \in \prod_{u \in \mathcal{U}} \big[ \prod_{j=1}^{k} \mathcal{S}_k \big]$, an i.i.d. collection of $k$-tuples of uniform permutations of $[k]$. Put $T := T(t-)$ and $T'$ equal to the tree constructed from $T$, $\{B^u : u \in \mathcal{U}\}$ and $\sigma$ through the function $\mathrm{CP}(\cdot,\cdot,\cdot)$ which is described in section 5.1. If $T_{|[n]} \neq T'_{|[n]}$, put $T_{|[n]}(t) = T'_{|[n]}$; otherwise, put $T_{|[n]}(t) = T_{|[n]}(t-)$.

Proposition 5.9. The above process $T$ is a Markov process on $\mathcal{T}^{(k)}$ with transition matrix $Q^{\nu}$ defined by theorem 5.7.

Proof. By the above construction, for every $n \geq 1$ and $t > 0$, $T_{|[n]}(t)$ evolves according to $r_n^{\nu}$ in (20), $D_{m,n} T_{|[n]}(t) = T_{|[m]}(t)$ for all $m < n$, and $T_{|[p]}(t) \in D_{n,p}^{-1}(T_{|[n]}(t))$ for all $p > n$. Hence, the restriction $T_{|[n]}$ is a $Q_n^{\nu}$-governed Markov process for each $n \geq 1$ and the result is clear by consistency of $Q_n$. □

By ignoring the arrival times in the above Poissonian construction and looking only at the embedded jump chain, we obtain a discrete-time process which evolves according to the CP($\nu$)-ancestral branching algorithm of section 2.1.



5.4 Feller process 

In [11] we show that the cut-and-paste process with finite-dimensional Markovian jump rates corresponding to the transition probabilities in (14) is a Feller process. Indeed, we now show that the ancestral branching Markov process on $\mathcal{T}^{(k)}$ which is induced by the CP($\nu$) Markov process is also Fellerian, but we first need some preliminaries.

Define the metric $d : \mathcal{T} \times \mathcal{T} \to \mathbb{R}_+$ by

$$d(T,T') := 1/\max\{n \in \mathbb{N} : T_{|[n]} = T'_{|[n]}\}, \tag{21}$$

for every $T,T' \in \mathcal{T}$, with the convention that $1/\infty = 0$.
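The metric in (21) can be evaluated directly on trees with finitely many labels. The sketch below is an illustration under stated assumptions: a tree on $[N]$ is represented as a set of `frozenset` vertices, restriction intersects every vertex with $[n]$ and discards empty sets, and agreement on all of $[N]$ is reported as distance $0$ (a finite stand-in for the convention $1/\infty = 0$). The function names are hypothetical.

```python
def restrict(tree, n):
    """Restriction of a tree (a set of frozensets of labels) to [n] = {1, ..., n}."""
    top = frozenset(range(1, n + 1))
    return frozenset(b & top for b in tree if b & top)


def tree_distance(t1, t2, size):
    """d(T, T') = 1 / max{n : T|[n] = T'|[n]} for trees labelled by [size]."""
    agree = 0
    for n in range(1, size + 1):
        if restrict(t1, n) != restrict(t2, n):
            break  # restrictions are nested, so later levels disagree as well
        agree = n
    if agree == size:
        return 0.0  # the trees coincide at every available level
    if agree == 0:
        raise ValueError("trees must at least agree on [1]")
    return 1.0 / agree


def make_tree(*vertices):
    return {frozenset(v) for v in vertices}


T1 = make_tree({1, 2, 3}, {1, 2}, {3}, {1}, {2})
T2 = make_tree({1, 2, 3}, {1}, {2, 3}, {2}, {3})
T3 = make_tree({1, 2, 3}, {1, 3}, {2}, {1}, {3})
d12 = tree_distance(T1, T2, 3)  # the trees agree on [2] but not on [3]
```

In this example all three trees share the restriction $\{\{1,2\},\{1\},\{2\}\}$ to $[2]$ and differ on $[3]$, so every pairwise distance is $1/2$.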

Proposition 5.10. $d$ is a metric on $\mathcal{T}$.

Proof. Positivity and symmetry are obvious. To see that the triangle inequality holds, let $T,T',T'' \in \mathcal{T}$ so that $d(T,T') = 1/a$ for some $a \geq 1$. Now suppose that $d(T,T'') = 1/b \geq 1/a$. Then the triangle inequality is trivially satisfied. If $d(T,T'') = 1/b < 1/a$ then $T_{|[b]} = T''_{|[b]}$ for $b > a$ and $T_{|[a]} = T'_{|[a]}$, but $T_{|[a+1]} \neq T'_{|[a+1]}$ by assumption. Hence, $d(T',T'') = 1/a$ and the triangle inequality holds. □

Proposition 5.11. $(\mathcal{T},d)$ is a compact space.

Proof. Let $(T^1,T^2,\ldots)$ be a sequence in $\mathcal{T}$. Any element $T \in \mathcal{T}$ can be written as a compatible sequence of finite-dimensional restrictions, $T := (T_{|[1]},T_{|[2]},\ldots) := (T_1,T_2,\ldots)$. The set $\mathcal{T}_n$ is finite for each $n$, and so one can extract a convergent subsequence $(T^{(1)},T^{(2)},\ldots)$ of $(T^1,T^2,\ldots)$ by the diagonal procedure such that $d(T^{(i)},T^{(j)}) \leq 1/\min\{i,j\}$ for all $i,j$. □

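Lemma 5.12. $C_f := \{f : \mathcal{T} \to \mathbb{R} : \exists n \in \mathbb{N} \text{ s.t. } d(T,T') \leq 1/n \Rightarrow f(T) = f(T')\}$ is dense in the space of continuous functions $\mathcal{T} \to \mathbb{R}$ under the metric $\rho(f,f') := \sup_{\tau \in \mathcal{T}} |f(\tau) - f'(\tau)|$.

Proof. Let $\varphi : \mathcal{T} \to \mathbb{R}$ be a continuous function. Since $(\mathcal{T},d)$ is compact, $\varphi$ is uniformly continuous, so for every $\varepsilon > 0$ there exists $n(\varepsilon) \in \mathbb{N}$ such that $\tau,\sigma \in \mathcal{T}$ satisfying $d(\tau,\sigma) \leq 1/n(\varepsilon)$ implies $|\varphi(\tau) - \varphi(\sigma)| < \varepsilon$.

For fixed $\varepsilon > 0$, let $N = n(\varepsilon)$ and define $f : \mathcal{T} \to \mathbb{R}$ as follows. First, partition $\mathcal{T}$ into equivalence classes $\{\tau \in \mathcal{T} : \tau_{|[N]} = t_{|[N]}\}$ for each $t \in \mathcal{T}$. For each equivalence class $U$, choose a representative element $\tilde{u} \in U$ and put $f(u) := \varphi(\tilde{u})$ for all $u \in U$. For any $t \in \mathcal{T}$, let $\tilde{t}$ denote the representative of $t$ obtained in this way. Hence, $f(t) = f(t') = f(\tilde{t})$ for all $t,t'$ such that $d(t,t') \leq 1/N$ and $f \in C_f$. Thus,

$$|f(\tau) - \varphi(\tau)| = |\varphi(\tilde{\tau}) - \varphi(\tau)| < \varepsilon$$

by continuity of $\varphi$ and

$$\rho(f,\varphi) = \sup_{\tau} |f(\tau) - \varphi(\tau)| \leq \varepsilon,$$

which establishes density. □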

Let $P_t$ be the semi-group of a $\varrho_\nu$-branching Markov process $T(\cdot)$, i.e. for any continuous $\varphi : \mathcal{T}^{(k)} \to \mathbb{R}$

$$P_t\varphi(\tau) := \mathbb{E}_{\tau}\, \varphi(T(t)),$$

the expectation of $\varphi(T(t))$ given $T(0) = \tau$.

Corollary 5.13. A CP($\nu$)-ancestral branching Markov process has the Feller property, i.e.

• for each continuous function $\varphi : \mathcal{T}^{(k)} \to \mathbb{R}$, for each $\tau \in \mathcal{T}^{(k)}$ one has $\lim_{t \downarrow 0} P_t\varphi(\tau) = \varphi(\tau)$,

• for all $t > 0$, $\tau \mapsto P_t\varphi(\tau)$ is continuous.

Proof. The proof follows the same line of reasoning as corollary 4.2 in [11]. Let $\varphi$ be a continuous function $\mathcal{T}^{(k)} \to \mathbb{R}$.

For $g \in C_f$, $\lim_{t \downarrow 0} P_t g(\tau) = g(\tau)$ is clear since the first jump-time of $T(\cdot)$ is exponential with finite mean. Denseness of $C_f$ establishes the first point.

For the second point, let $n \geq 1$ and $\tau,\tau' \in \mathcal{T}^{(k)}$ such that $d(\tau,\tau') \leq 1/n$, i.e. $\tau_{|[n]} = \tau'_{|[n]}$. Use the same Poisson point process $P$, as in section 5.3, to construct $T(\cdot)$ and $T'(\cdot)$ such that $T(0) = \tau$ and $T'(0) = \tau'$. By construction, $T_{|[n]} = T'_{|[n]}$ and $d(T(t),T'(t)) \leq 1/n$ for all $t \geq 0$. Hence, for any continuous $\varphi$, $\tau \mapsto P_t\varphi(\tau)$ is continuous. □
By corollary 5.13 we can characterize the CP($\nu$)-ancestral branching Markov process $(T(t), t \geq 0)$ with finite-dimensional rates $(q_n(\cdot,\cdot;\nu), n \geq 1)$ by its infinitesimal generator $\mathcal{G}$ given by



J &■(.*) 



for every $f \in C_f$.

Our proof of the Feller property for the CP($\nu$)-ancestral branching Markov process makes use of the Poissonian construction of the previous section. In light of the specification of the cut-and-paste algorithm in section 4, it is straightforward to see that we can construct a general cut-and-paste ancestral branching process via a Poisson point process by slight modification, and the various properties shown for the CP($\nu$)-ancestral branching process herein may also apply to general cut-and-paste ancestral branching processes. This is beyond the scope of the current paper, as we are principally interested in establishing properties of the CP($\nu$)-ancestral branching process on $\mathcal{T}^{(k)}$.



6 Mass fragmentations 

A mass fragmentation of $x \in \mathbb{R}_+$ is a collection $M_x$ of masses such that

(i) $x \in M_x$ and

(ii) there are $m_1,\ldots,m_k \in M_x$ such that $\sum_{i=1}^{k} m_i \leq x$ and

$$M_x = \{x\} \cup M_{m_1} \cup \cdots \cup M_{m_k}.$$

We write $\mathcal{M}_x$ to denote mass fragmentations of $x$. Essentially, a mass fragmentation of $x$ is a fragmentation tree whose vertices are labeled by masses such that the children of a vertex comprise a ranked-mass partition of its parent vertex. The case where children $\{m_1,\ldots,m_k\}$ of a vertex $m$ satisfy $\sum_{i=1}^{k} m_i < m$ is called a dissipative mass fragmentation. Herein, we are interested in conservative mass fragmentations which have the property that the children $\{m_1,\ldots,m_k\}$ of every vertex $m \in M_x$ satisfy $\sum_{i=1}^{k} m_i = m$. It is plain that $\mathcal{M}_x$ is isomorphic to $\mathcal{M}_1$ by scaling, i.e. $\mathcal{M}_x = x\mathcal{M}_1$, and so it is sufficient to study $\mathcal{M}_1$. See Bertoin [8] for a study of Markov processes on $\mathcal{M}_1$ called fragmentation chains. Here we construct a Markov process on $\mathcal{M}_1$ which corresponds to the associated mass fragmentation valued process of the CP($\nu$)-ancestral branching Markov process on $\mathcal{T}^{(k)}$, which has been studied in previous sections.

Definition 6.1. A subset $A \subseteq \mathbb{N}$ is said to have asymptotic frequency $\lambda$ if

$$\lambda := \lim_{n \to \infty} \frac{\#(A \cap [n])}{n}$$

exists.
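As a small illustration of the definition, the ratio $\#(A \cap [n])/n$ can be computed exactly for any set given by a membership test. The function name `empirical_frequency` is an assumption of this sketch; the finite-$n$ ratio only approximates the limit, which may fail to exist for general sets.

```python
from fractions import Fraction


def empirical_frequency(indicator, n):
    """Exact value of #(A ∩ [n]) / n for the set A = {i : indicator(i)}."""
    count = sum(1 for i in range(1, n + 1) if indicator(i))
    return Fraction(count, n)


def is_multiple_of_three(i):
    return i % 3 == 0


freq_10 = empirical_frequency(is_multiple_of_three, 10)      # 3/10
freq_3000 = empirical_frequency(is_multiple_of_three, 3000)  # 1/3
```

For the multiples of three the ratio converges to $1/3$, which is the asymptotic frequency of that set.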

A partition $B = \{B_1,B_2,\ldots\} \in \mathcal{P}$ is said to possess asymptotic frequency $\|B\|$ if each of its blocks has asymptotic frequency and we write $\|B\| := (\|B_1\|,\ldots)^{\downarrow} \in \mathcal{P}_m$, the decreasing rearrangement of block frequencies of $B$. According to Kingman's correspondence [19], any infinitely exchangeable partition $B$ of $\mathbb{N}$ possesses asymptotic frequencies which are distributed according to $\nu$ where $\nu$ is the unique measure on $\mathcal{P}_m$ such that



6.1 Associated mass fragmentation process 
Fix $k \geq 2$ and let $\nu$ be a probability measure on $\mathcal{P}_m^{(k)}$. Let $\mathcal{M}_1^{(k)} := \{\mu \in \mathcal{M}_1 : \#A \leq k \text{ for every } A \in \mu\}$ be the subspace of conservative mass fragmentations of 1 such that each $A \in \mu \in \mathcal{M}_1^{(k)}$ has at most $k$ children.

Construct a Markov chain on $\mathcal{M}_1^{(k)}$ as follows. For $\mu \in \mathcal{M}_1^{(k)}$, the transition $\mu \mapsto \tilde{\mu} \in \mathcal{M}_1^{(k)}$ is generated by an i.i.d. collection $S := \{s^u : u \in \mathcal{U}\}$ of $\nu^{(k)}$ mass partitions, i.e. $s^u := (s^u_1,\ldots,s^u_k) \in \prod_{i=1}^{k} \mathcal{P}_m$ is an i.i.d. collection of mass partitions distributed according to $\nu$ and $s^w$ is independent of $s^v$ for all $w \neq v$, and $\Sigma := \{\sigma^u : u \in \mathcal{U}\}$ i.i.d. $k$-tuples of i.i.d. uniform permutations of $[k]$.



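(i) Write $\mu := \{\mu^u : u \in \mathcal{U}\}$ and $\tilde{\mu} := \{\tilde{\mu}^u : u \in \mathcal{U}\}$.

(ii) Put $\tilde{\mu}^{\emptyset} = 1$, the root of $\tilde{\mu}$.

(iii) Given $\tilde{\mu}^u \in \tilde{\mu}$, put $\tilde{\mu}^{uj}$ equal to the $j$th largest column total of the matrix

$$\begin{pmatrix}
\tilde{\mu}^u \mu^1 s^u_{1,\sigma^u_1(1)} & \tilde{\mu}^u \mu^1 s^u_{1,\sigma^u_1(2)} & \cdots & \tilde{\mu}^u \mu^1 s^u_{1,\sigma^u_1(k)} \\
\tilde{\mu}^u \mu^2 s^u_{2,\sigma^u_2(1)} & \tilde{\mu}^u \mu^2 s^u_{2,\sigma^u_2(2)} & \cdots & \tilde{\mu}^u \mu^2 s^u_{2,\sigma^u_2(k)} \\
\vdots & \vdots & \ddots & \vdots \\
\tilde{\mu}^u \mu^k s^u_{k,\sigma^u_k(1)} & \tilde{\mu}^u \mu^k s^u_{k,\sigma^u_k(2)} & \cdots & \tilde{\mu}^u \mu^k s^u_{k,\sigma^u_k(k)}
\end{pmatrix},$$

i.e. $\tilde{\mu}^{uj}$ is the $j$th largest of the totals $\sum_{i=1}^{k} \tilde{\mu}^u \mu^i s^u_{i,\sigma^u_i(m)}$, $m = 1,\ldots,k$, where $\mu^1,\ldots,\mu^k$ correspond to the mass fragmentation of the root of $\mu$.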
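Step (iii) amounts to forming column totals of a matrix of masses and ranking them. The sketch below assumes the matrix entries are $\tilde{\mu}^u \mu^i s^u_{i,\sigma^u_i(m)}$ (parent mass, times the $i$th root child mass of the current state, times a permuted component of the $i$th mass partition); this reading of the matrix, the function name, and the 0-based permutations are assumptions of the example.

```python
def child_masses(parent_mass, root_masses, s, sigma):
    """Ranked column totals sum_i parent_mass * root_masses[i] * s[i][sigma[i][m]].

    parent_mass : mass of the vertex being fragmented in the new state
    root_masses : masses mu^1, ..., mu^k of the root's children in the old state
    s           : k mass partitions, each a list of k non-negative masses
    sigma       : k permutations of range(k), 0-based
    """
    k = len(root_masses)
    totals = [
        sum(parent_mass * root_masses[i] * s[i][sigma[i][m]] for i in range(k))
        for m in range(k)
    ]
    return sorted(totals, reverse=True)  # the j-th entry is the j-th largest total


masses = child_masses(
    parent_mass=1.0,
    root_masses=[0.5, 0.5],
    s=[[0.75, 0.25], [0.5, 0.5]],
    sigma=[[0, 1], [1, 0]],
)  # [0.625, 0.375]
```

When the root masses and every row of `s` sum to one, the returned masses sum to `parent_mass`, which matches the conservative case considered in this section.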



Definition 6.2. For a fragmentation tree $T \in \mathcal{T}^{(k)}$, we write $M(T)$ to denote the associated mass fragmentation of $T$, i.e. the mass fragmentation of 1 obtained by replacing each child of $T$ by its asymptotic frequency, if it exists.

Theorem 6.3. Let $T := (T_n, n \geq 1)$ be a CP($\nu$)-ancestral branching Markov chain with transition measure $Q(\cdot,\cdot;\nu)$ on $\mathcal{T}^{(k)}$, with initial distribution $\pi$ some infinitely exchangeable measure on $\mathcal{T}^{(k)}$. Let $\mu := (\mu_n, n \geq 1)$ be the Markov chain on $\mathcal{M}_1^{(k)}$ generated from the above procedure. Then $M(T) =_{\mathcal{L}} \mu$. Moreover, the transition measure $\lambda(\cdot,\cdot;\nu)$ for $\mu$ is given by

$$\lambda(\mu,\mu';\nu) = Q(T_{\mu}, M^{-1}(\mu');\nu)$$

where $T_{\mu}$ is any element of $M^{-1}(\mu) := \{T \in \mathcal{T}^{(k)} : M(T) = \mu\}$.



Proof. Fix $k \geq 2$ and $\nu$ a probability measure on $\mathcal{P}_m^{(k)}$. For $T \sim Q(\cdot,\cdot;\nu)$ we have that for every $n \geq 1$ and $t \in T_n$, the set of children $\{t_1,\ldots,t_m\}$ of $t$ forms an exchangeable partition of $t \subseteq \mathbb{N}$ given $T_{n-1}$ and so possesses asymptotic frequency $\|t\|$ almost surely by Kingman's correspondence.

The Markov chain $T$ with transition measure $Q(\cdot,\cdot;\nu)$ constructed in section 5.1 can also be constructed as follows. Let $S := \{s^u : u \in \mathcal{U}\}$ be the collection of mass partitions in the construction at the beginning of this section. Given $S$, generate $B := \{B^u : u \in \mathcal{U}\} \in \prod_{u \in \mathcal{U}} \big[ \prod_{i=1}^{k} \mathcal{P}^{(k)} \big]$ by letting $B^u := (B^u_1,\ldots,B^u_k)$ and $B^u_j \sim \varrho_{s^u_j}$ independently of all other $B^v_i$. Constructed in this way, $\{B^u : u \in \mathcal{U}\}$ is a collection of i.i.d. $\varrho_\nu^{(k)}$ partitions whose asymptotic frequencies satisfy $\|B^u_j\| = s^u_j$ almost surely. Furthermore, the unconditional distribution of each $B^u$ is $\varrho_\nu^{(k)}$.

Next, we let $\Sigma := \{\sigma^u : u \in \mathcal{U}\}$ be a collection of i.i.d. $k$-tuples of i.i.d. uniform permutations of $[k]$ and generate transitions of $T$ from the alternative construction of section 5.1 based on $\Sigma$ and $\{B^u : u \in \mathcal{U}\}$ and generate a Markov chain $\mu$ on $\mathcal{M}_1$ based on $\Sigma$ and $S$. Then we have that $T$ is a Markov chain with transition measure $Q(\cdot,\cdot;\nu)$ on $\big( \mathcal{T}^{(k)}, \sigma\big( \bigcup_{n \geq 1} \mathcal{T}_n^{(k)} \big) \big)$ and, furthermore, by the above construction, we have that the associated mass fragmentation chain $M(T) := (M(T_n), n \geq 1)$ is equal to $\mu$ almost surely.

By the three-step construction of transitions on $\mathcal{M}_1$ at the beginning of this section, it is clear that $\mu$ is a Markov chain. Hence, the function $M(T)$ is a Markov chain and so the result of Burke and Rosenblatt [10] states that it is necessary that the transition measure of $M(T)$ satisfies

$$QM^{-1}(m,m';\nu) = \int_{M^{-1}(m')} Q(T_m, dt)$$

for all $T_m \in M^{-1}(m) := \{T \in \mathcal{T} : M(T) = m\}$.

Finally, since $M(T) = \mu$ almost surely, we have that the transition measure $\lambda$ of $\mu$ on $\mathcal{M}_1$ satisfies $\lambda = QM^{-1}$. □

Corollary 6.4. The associated mass fragmentation process $M(T)$ exists almost surely.

6.2 Equilibrium measure

As in section 5.2, suppose $\nu$ is non-degenerate at $(1,0,\ldots,0) \in \mathcal{P}_m^{(k)}$. Corollary 5.4 states that a Markov chain $T := (T_n, n \geq 1)$ governed by $Q(\cdot,\cdot;\nu)$ possesses a unique equilibrium measure $\rho(\cdot;\nu)$. The following theorem follows immediately from this fact and from theorem 6.3.

Theorem 6.5. Let $\nu$ be a probability measure on $\mathcal{P}_m^{(k)}$ such that $\nu((1,0,\ldots,0)) < 1$. The mass fragmentation chain $\mu := (\mu_n, n \geq 1)$ on $\mathcal{M}_1$ governed by $QM^{-1}(\cdot,\cdot;\nu)$ possesses a unique stationary measure $\zeta(\cdot;\nu)$. Moreover, for $\mu \in \mathcal{M}_1$,

$$\zeta(\mu;\nu) = \rho(M^{-1}(\mu);\nu)$$

where $\rho(\cdot;\nu)$ is the unique equilibrium measure of $Q(\cdot,\cdot;\nu)$ on $\mathcal{T}^{(k)}$ from corollary 5.4.

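Proof. Let $\mu$ be a Markov chain on $\mathcal{M}_1$ with transition measure $\lambda(\cdot,\cdot;\nu)$ governed by the transition procedure at the beginning of section 6. By theorem 6.3 we have that $\lambda = QM^{-1}$ where $Q(\cdot,\cdot;\nu)$ is the transition measure of the CP($\nu$)-ancestral branching Markov chain on $\mathcal{T}^{(k)}$ with unique equilibrium measure $\rho(\cdot;\nu)$ from corollary 5.4.

Furthermore, it is shown in theorem 6.3 that $\mu$ is equal in distribution to the associated mass fragmentation chain of a Markov chain on $\mathcal{T}^{(k)}$ governed by $Q(\cdot,\cdot;\nu)$. Hence, we have

$$\rho(T';\nu) = \int_{\mathcal{T}^{(k)}} Q(T,T';\nu)\, \rho(dT)$$

and for $\mu' \in \mathcal{M}_1$

\begin{align*}
\rho M^{-1}(\mu';\nu) &= \rho\big[ M^{-1}(\mu');\nu \big] \\
&= \int_{M^{-1}(\mu')} \int_{\mathcal{T}^{(k)}} Q(\tau,dt;\nu)\, \rho(d\tau;\nu) \\
&= \int_{\mathcal{T}^{(k)}} Q(\tau, M^{-1}(\mu');\nu)\, \rho(d\tau;\nu) \\
&= \int_{\mathcal{M}_1} QM^{-1}(\mu,\mu';\nu)\, \rho M^{-1}(d\mu) \\
&= \int_{\mathcal{M}_1} \lambda(\mu,\mu';\nu)\, \rho M^{-1}(d\mu),
\end{align*}

which shows that $\zeta := \rho M^{-1}$ is stationary for $\lambda$. □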



6.3 Poissonian construction 

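Just as the CP($\nu$)-ancestral branching process on $\mathcal{T}^{(k)}$ admits a Poissonian construction, which we showed in section 5.3, so does its associated mass fragmentation-valued process, which we now show.

Let $\nu$ be a probability measure on $\mathcal{P}_m^{(k)}$. Let $S = \{(t,s^u) : u \in \mathcal{U}\} \subset \mathbb{R}_+ \times \prod_{u \in \mathcal{U}} \prod_{i=1}^{k} \mathcal{P}_m^{(k)}$ be a Poisson point process with intensity $dt \otimes \lambda \bigotimes_{u \in \mathcal{U}} \nu^{(k)}$ for some $\lambda > 0$, where $\nu^{(k)} := \nu \otimes \cdots \otimes \nu$ is the $k$-fold product measure on $\prod_{i=1}^{k} \mathcal{P}_m^{(k)}$ and $s^u := (s^u_1,\ldots,s^u_k) \in \prod_{i=1}^{k} \mathcal{P}_m^{(k)}$ for each $u \in \mathcal{U}$.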



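Construct a Markov process $\mu := (\mu(t), t \geq 0)$ in continuous time on $\mathcal{M}_1$ as follows. Let $\mu_0$ be a mass fragmentation drawn from some distribution on $\mathcal{M}_1$. Put $\mu(0) = \mu_0$ and

• if $t$ is not an atom time for $S$, $\mu(t) = \mu(t-)$;

• if $t$ is an atom time for $S$, generate $\Sigma := \{\sigma^u : u \in \mathcal{U}\}$ where $\sigma^v$ and $\sigma^w$ are independent for all $v \neq w$ and $\sigma^u := (\sigma^u_1,\ldots,\sigma^u_k)$ is an i.i.d. sequence of uniform permutations of $[k]$ for each $u \in \mathcal{U}$. Given $(t,s^u) \in S$, $\sigma^u$ and $\mu(t-) = \{\mu^u : u \in \mathcal{U}\}$, put $\mu(t) = \{\tilde{\mu}^u : u \in \mathcal{U}\}$ where

1) $\tilde{\mu}^{\emptyset} = 1$ and

2) given $\tilde{\mu}^u$, put $\tilde{\mu}^{uj}$ equal to the $j$th largest column total of the matrix

$$\begin{pmatrix}
\tilde{\mu}^u \mu^1 s^u_{1,\sigma^u_1(1)} & \tilde{\mu}^u \mu^1 s^u_{1,\sigma^u_1(2)} & \cdots & \tilde{\mu}^u \mu^1 s^u_{1,\sigma^u_1(k)} \\
\tilde{\mu}^u \mu^2 s^u_{2,\sigma^u_2(1)} & \tilde{\mu}^u \mu^2 s^u_{2,\sigma^u_2(2)} & \cdots & \tilde{\mu}^u \mu^2 s^u_{2,\sigma^u_2(k)} \\
\vdots & \vdots & \ddots & \vdots \\
\tilde{\mu}^u \mu^k s^u_{k,\sigma^u_k(1)} & \tilde{\mu}^u \mu^k s^u_{k,\sigma^u_k(2)} & \cdots & \tilde{\mu}^u \mu^k s^u_{k,\sigma^u_k(k)}
\end{pmatrix},$$

i.e. $(\tilde{\mu}^{uj}, j = 1,\ldots,k) := \Big( \sum_{i=1}^{k} \tilde{\mu}^u \mu^i s^u_{i,\sigma^u_i(m)},\ m = 1,\ldots,k \Big)^{\downarrow}$.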



Theorem 6.6. Let $T := (T(t), t \geq 0)$ be a CP($\nu$)-ancestral branching Markov process from section 5.2.1 and let $\mu := (\mu(t), t \geq 0)$ be the Markov process on $\mathcal{M}_1$ generated from the above Poisson point process. Then $M(T) =_{\mathcal{L}} \mu$.

Proof. Let $k \in \mathbb{N}$ and $\nu$ be a measure on $\mathcal{P}_m^{(k)}$.

Let $S = \{(t,s^u) : u \in \mathcal{U}\} \subset \mathbb{R}_+ \times \prod_{u \in \mathcal{U}} \prod_{i=1}^{k} \mathcal{P}_m^{(k)}$ be a Poisson point process with intensity $dt \otimes \lambda \bigotimes_{u \in \mathcal{U}} \nu^{(k)}$ for some $\lambda > 0$ as shown above and let $\mu := (\mu(t), t \geq 0)$ be the process on $\mathcal{M}_1$ constructed above. Given $S$, generate $P := \{(t,B^u) : u \in \mathcal{U}\} \subset \mathbb{R}_+ \times \prod_{u \in \mathcal{U}} \big[ \prod_{i=1}^{k} \mathcal{P}^{(k)} \big]$ where for each $(t, s^u : u \in \mathcal{U}) \in S$ we let $B^u := (B^u_1,\ldots,B^u_k) \in \prod_{i=1}^{k} \mathcal{P}^{(k)}$ be a $k$-tuple of partitions such that $B^u_i \sim \varrho_{s^u_i}$ for each $i = 1,\ldots,k$ and all components are independent. Thus, we have that $P$ is a Poisson point process on $\mathbb{R}_+ \times \prod_{u \in \mathcal{U}} \big[ \prod_{i=1}^{k} \mathcal{P}^{(k)} \big]$ with intensity measure $dt \otimes \lambda \bigotimes_{u \in \mathcal{U}} \varrho_\nu^{(k)}$. Given $P$ and $S$, generate $\Sigma := \{\sigma^u : u \in \mathcal{U}\}$ independently of $P$ and $S$ such that $\sigma^v$ and $\sigma^w$ are independent for all $v \neq w$ and each $\sigma^u = (\sigma^u_1,\ldots,\sigma^u_k)$ is an i.i.d. collection of uniform permutations of $[k]$.

Let $T := (T(t), t \geq 0)$ be the process on $\mathcal{T}^{(k)}$ constructed from $\Sigma$ and $P$, as shown in section 5.3, so that $T$ is a CP($\nu$)-ancestral branching Markov process. Likewise, let $\mu := (\mu(t), t \geq 0)$ be the process on $\mathcal{M}_1$ constructed from $\Sigma$ and $S$ shown above.

Now for all t > 0, let T(t-) = T. Then T(t) = z where 

for each uGf and 7 = 1 , . . . , k which has asymptotic frequency 

k k 



\t u \\ V iIt'IIMr" 11 n*V n't* n« 



1=1 1=1 

Hence we have that $\mu = M(T)$ a.s. in this construction and so $\mu =_{\mathcal{L}} M(T)$. □

Corollary 6.7. The process $M(T) := (M(T(t)), t \geq 0)$ exists almost surely.

7 Weighted trees 

A weighted tree is a fragmentation tree with edge lengths. We write $\bar{\mathcal{T}} := \mathcal{T} \times (\mathbb{R}_+)$ to denote the space of weighted trees; i.e. each $\bar{T} \in \bar{\mathcal{T}}$ is a pair $(T, \{t_b : b \in T\})$ consisting of a fragmentation tree $T$ and a set of edge lengths corresponding to each edge of the tree with the convention that $t_b = 0$ if $b \notin T$. We prefer the term weighted tree to the alternative fragmentation process which is generally thought of as a non-increasing sequence of random partitions of $\mathbb{N}$, $B := (B(t), t \geq 0)$, indexed by $t \in \mathbb{R}_+$, i.e. $B(t) \leq B(s)$ for all $t \geq s$. By referring to these objects as weighted trees, we hope to emphasize $\bar{T} \in \bar{\mathcal{T}}$ as an object, rather than a process. In this way, our construction of a Markov process on $\bar{\mathcal{T}}^{(k)}$ is naturally interpreted as a random walk on this space of objects with only one temporal component, that being how our process on $\bar{\mathcal{T}}^{(k)}$ evolves in time.

In section 5 we introduce the CP($\nu$) family of AB transition probabilities $Q_n(T,\cdot;\nu)$ for each $k \geq 2$, $T \in \mathcal{T}_n^{(k)}$ and $\nu$ a probability measure on $\mathcal{P}_m^{(k)}$. The results of sections 3 and 5.2 establish the existence of a transition measure $Q(T,\cdot;\nu)$ on $\mathcal{T}^{(k)}$ with infinitely exchangeable stationary measure $\rho(\cdot;\nu)$.

We now construct a transition probability on $\bar{\mathcal{T}}_n^{(k)}$. Let $\bar{T} = (T, \{t_b : b \in T\}) \in \bar{\mathcal{T}}_n^{(k)}$ and generate $\bar{T}' = (T', \{t'_b : b \in T'\}) \in \bar{\mathcal{T}}_n^{(k)}$ by the following two-step procedure.
Ancestral Branching with edge lengths Algorithm

(i) Generate $T'$ from $Q_n(T,\cdot;\nu)$;

(ii) given $T'$, generate each $t'_b$ from an exponential distribution with rate parameter $\theta q_b(\Pi_{T_{|b}}, \mathbf{1}_b;\nu)$ (i.e. mean $1/\theta q_b(\Pi_{T_{|b}}, \mathbf{1}_b;\nu)$) independently for each $b \in T'$, for some $\theta > 0$.

This procedure yields a transition density on $\bar{\mathcal{T}}_n^{(k)}$ given by

$$\bar{Q}_n(\bar{T},\bar{T}';\nu) = \prod_{b \in T'} \theta\, p_b(\Pi_{T_{|b}}, \Pi_{T'_{|b}};\nu)\, e^{-\theta t'_b q_b(\Pi_{T_{|b}}, \mathbf{1}_b;\nu)}\, dt'_b. \tag{23}$$

The purpose of choosing each waiting time $t'_b$ to be an exponential random variable with parameter $\theta q_b(\Pi_{T_{|b}}, \mathbf{1}_b;\nu)$ is to ensure the consistency of the process under restriction.

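Consider $\bar{T} = (T, \{t_b : b \in T\})$ and $\bar{T}^* = (T^*, \{t^*_b : b \in T^*\})$ such that $T^* \in D^{-1}_{n,n+1}(T)$. Then $T^*$ has a vertex $A \cup \{n+1\}$ with children $\{n+1\}$ and $A \in T$. This is the branch of $T$ on which the leaf $\{n+1\}$ is attached. Denote this vertex by $A^* \in T^*$ and require that $t^*_b = t_b$ for $b \notin \{A^*, A\}$ and $t^*_{A^*} + t^*_A = t_A$. We denote by $\bar{D}^{-1}_{n,n+1}(\bar{T})$ the set of $\bar{T}^*$ satisfying these conditions.

Consistency requires that for a tree $\bar{T}'' \sim \bar{Q}_{n+1}(\bar{T}^*,\cdot;\nu)$, the restriction $\bar{T}' := \bar{T}''_{|[n]}$ is distributed as $\bar{Q}_n(\bar{T}^*_{|[n]},\cdot;\nu)$.

Proposition 7.1. Let $\nu$ be a probability measure on $\mathcal{P}_m^{(k)}$, $n \geq 1$, $\bar{T}^* \in \bar{\mathcal{T}}_{n+1}^{(k)}$ and $\bar{T}'' \sim \bar{Q}_{n+1}(\bar{T}^*,\cdot;\nu)$. Then the restriction $\bar{T}' := \bar{T}''_{|[n]}$ is distributed as $\bar{Q}_n(\bar{T}^*_{|[n]},\cdot;\nu)$.

Proof. Let $\bar{T}^* = (T^*, \{t^*_b : b \in T^*\}) \in \bar{\mathcal{T}}_{n+1}^{(k)}$ and $\bar{T}'' = (T'', \{t''_b : b \in T''\}) \in \bar{\mathcal{T}}_{n+1}^{(k)}$. By construction of $\bar{Q}_n(\cdot,\cdot;\nu)$ on $\bar{\mathcal{T}}_n^{(k)}$ for each $n \geq 1$, we have that $T''_{|[n]} \sim Q_n(T^*_{|[n]},\cdot;\nu)$ and the induced process on boolean trees is consistent.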

Let $t''_{n+1}$ denote the length of the root edge of $\bar{T}''$ and consider the length of the root edge of the restriction $\bar{T}''_{|[n]}$, denoted $t'_n$. If $\Pi_{T''} \neq e_{n+1}$, then $t'_n = t''_{n+1}$. Otherwise, $t'_n = t''_{n+1} + t''_n$. Hence, $t'_n \sim \tau + \tau' \mathbf{1}_A$ where $\tau$ and $\tau'$ are, respectively, independent exponential random variables with parameters $\theta q_{n+1}(\Pi_{T^*}, \mathbf{1}_{n+1};\nu)$ and $\theta q_n(\Pi_{T^*_{|[n]}}, \mathbf{1}_n;\nu)$ for some $\theta > 0$ and $A := \{\Pi_{T''} = e_{n+1}\}$, the event that the children of the root $[n+1]$ in $T''$ are $[n]$ and $\{n+1\}$, is independent of $\tau$ and $\tau'$.

For notational convenience, we drop the dependence on $\nu$ and write $q_b(\cdot,\cdot) = q_b(\cdot,\cdot;\nu)$ for any $b \subseteq \mathbb{N}$, likewise for $p_b(\cdot,\cdot;\nu)$, where $q_n$ and $p_n$ are defined in section 4.

An exponential random variable with rate parameter A > has moment generating function $%{t) := 
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\/{X —t). The moment generating function of t' n is 



= Ee'W Tl ^ 

0gn+l(nr*,ln+l) 

09»+i(nr*,l«+i)-^ 

0gn+l(nr*,ln+l) 

8q n+ i(n.T*,l n +i)-t 

9q„ + l(U T *,ln+l) 

Oq„+\(YlT*,hi+i)-t 



E (e n ' lA \A\ P(A) + E (e ,z ' lA \A C ) P(A C ) 

p B+1 (n r .,e H+ i) Mn^i,) | 1 

q n+ l(U.T* ,l n +l) Oq n (TI T * ,l n )-t 



p n +i(nr*,e„ + i) 
9n+i(nr*,l B +i) 



p n+ i(IlT*,e n+ i)0?„(IlT. ,1«) 



^„ + i(nr*,i„+i)(0^ n (riT* i„)-f) 



?«+i (n r * , i„ + i)(e^ n (nT* , i„) - f ) 



^„ + i(n r »,e„ + i)(0^,(riT» i„)-r) 



IM 



q n+ l(Tl T * ,l n+l )(6q„(Tl T * l„)-f) 



0^n+i(nr*,l n 



+1; 



09„ + i(nr*,l n +i)-f 



?n+l (fir* ,l«+l) %n (fir* , 1 B ) - f^„ + i (Il r * , 1„ +1 ) 



q n +\ (TIt* , l„ + i)(0^„(nT* ,1„) - f) 



fp«+i(rir*,e n 



+u 



?«+i (nr* , i„+i )(e?„(n^ ] ,i„)-t 

0<7n+l(nr*,ln+l, 



0^„ + l(IIr*,ln+l)-f 



^„ + i(n r * . i„ + i)0^„(ri2|* , i„) - f^„(riT« i„) 



q n+l {U T * ,ln+i)(dq n {Tl T * 1„) - t) 



9q n (n T * n] ,l n ) 



(24) 
(25) 

(26) 



(27) 



(28) 



(29) 



(30) 



0tf„(IIr ? 1„) — f 

the moment generating function of t'. 

Line (24) follows by independence of $\tau$, $\tau'$ and $A$; (25) uses the tower property of conditional expectations;
(26) substitutes explicit expressions for the terms in (25); (27) puts the bracketed terms over a common
denominator; (28) is obtained from (27) by canceling terms in the numerator; (29) follows from (28) by the fact that
$q_n(\Pi_{T^*_{|[n]}}, 1_n) = q_{n+1}(\Pi_{T^*}, 1_{n+1}) - p_{n+1}(\Pi_{T^*}, e_{n+1})$, by consistency of (14); finally, (30) is
obtained by simplifying the expression.

By the branching property of $Q_n(\cdot, \cdot\,; \nu)$, the restriction $T''_{|[n]}$ is distributed as $Q_n(T^*_{|[n]}, \cdot\,; \nu)$. □
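
The identity between (26) and (30) can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, the scalars `q_next` and `p_split` stand in for $q_{n+1}(\Pi_{T^*}, 1_{n+1})$ and $p_{n+1}(\Pi_{T^*}, e_{n+1})$, and the rate at level $n$ is taken to be `q_next - p_split`, as in the consistency relation used for (29). It compares the two closed forms in exact rational arithmetic and also samples $\tau + \tau' 1_A$ directly.

```python
import random
from fractions import Fraction


def mgf_mixture(theta, q_next, p_split, t):
    """Bracketed form (26): MGF of tau + tau' * 1_A, valid for t < theta * q_n."""
    q_n = q_next - p_split                  # consistency relation used for (29)
    prefix = theta * q_next / (theta * q_next - t)
    prob_a = p_split / q_next               # P(A)
    return prefix * (prob_a * theta * q_n / (theta * q_n - t) + 1 - prob_a)


def mgf_exponential(rate, t):
    """MGF of an exponential random variable with the given rate, for t < rate."""
    return rate / (rate - t)


def sample_hold_time(theta, q_next, p_split, rng):
    """Draw tau + tau' * 1_A with tau, tau' and A independent."""
    q_n = q_next - p_split
    total = rng.expovariate(theta * q_next)        # tau
    if rng.random() < p_split / q_next:            # the event A occurs
        total += rng.expovariate(theta * q_n)      # tau'
    return total


if __name__ == "__main__":
    theta, q_next, p_split = Fraction(3, 2), Fraction(7, 10), Fraction(1, 5)
    t = Fraction(1, 4)
    print(mgf_mixture(theta, q_next, p_split, t))
    print(mgf_exponential(theta * (q_next - p_split), t))
```

Using `Fraction` keeps the comparison exact, so any discrepancy would indicate an algebra error rather than rounding. The sampler gives an independent Monte Carlo check of the same claim.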



Finite exchangeability is immediate by inspecting the form of (23). The existence of a transition density on
$\bar{\mathcal{T}}^{(k)}$ is once again immediate by Kolmogorov's theorem.

Theorem 7.2. There exists a transition density $Q(\cdot, \cdot\,; \nu)$ on $\bar{\mathcal{T}}^{(k)}$ whose finite-dimensional
restrictions are given by (23).

The above process on weighted trees for the CP($\nu$)-ancestral branching process on $\mathcal{T}^{(k)}$ is straightforward
to construct, mainly because of the restriction to trees with a bounded number of children, i.e. each parent can
have no more than $k \geq 1$ children. For this reason, our specification does not run into issues related to the
accumulation of an infinite number of partition events. On one hand, this restriction makes the existence of the
above process uninteresting probabilistically, as we restrict our attention to only a finite number of events. On
the other hand, it provides an explicit, easily implemented procedure for generating a random sequence of,
for example, binary trees, which could be of interest in certain applications.

8 Discussion

Here we have shown an explicit construction of Markov processes on $\mathcal{T}$ and $\mathcal{P}$ via, respectively, the
ancestral branching (AB) and cut-and-paste algorithms, and identified conditions under which the AB algorithm
characterizes the transition probabilities of an infinitely exchangeable tree-valued process. There is potentially
a wealth of interesting work to be done by exploring this family of processes in more detail. We provide some
details on the ancestral branching process associated with the transition probabilities of the cut-and-paste
process with parameter $\nu$, where $\nu$ is a measure on the ranked-$k$ simplex. In this case, the associated
tree-valued process is restricted to $\mathcal{T}^{(k)}$. A process based on a more general form of the cut-and-paste
algorithm, which is not restricted to trees with a bounded number of children, could be interesting to study.
However, the case that we study is also interesting, particularly when $k = 2$, which gives an infinitely
exchangeable process on the space of binary trees.

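For the parametric subfamily of the CP($\nu$)-process with $\nu = \mathrm{PD}(-\alpha/k, \alpha)$, the finite-dimensional
transition probabilities on $\mathcal{T}_n^{(k)}$, for $n \geq 1$, $\alpha > 0$ and $t, t' \in \mathcal{T}_n^{(k)}$, are given by

$$Q_n(t, t'; \alpha) = \prod_{b \in t' : \#b \geq 2} \frac{2\, \mathrm{per}_{\alpha/2}(B \wedge B')}{\mathrm{per}_{\alpha} B - 2\, \mathrm{per}_{\alpha/2} B},$$

where $\mathrm{per}_{\alpha} B$ represents the $\alpha$-permanent of $B$, regarded as a 0-1 valued Boolean matrix.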
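
A single factor of this product can be evaluated directly for small partitions. The sketch below is illustrative only and rests on stated assumptions: all function names are hypothetical; the $\alpha$-permanent is taken in its usual form $\mathrm{per}_\alpha M = \sum_{\sigma} \alpha^{\mathrm{cyc}(\sigma)} \prod_i M_{i\sigma(i)}$; $B \wedge B'$ is read as the partition of non-empty pairwise block intersections (the entrywise product of the two 0-1 matrices); and the factors of 2 are read as the binary case $k = 2$. For a partition matrix the permanent factorizes over blocks into rising factorials, which the code uses as a shortcut and checks against brute force.

```python
from itertools import permutations, product
from math import prod


def partition_matrix(blocks, n):
    """0-1 matrix whose (i, j) entry is 1 when i and j lie in the same block."""
    label = {i: idx for idx, block in enumerate(blocks) for i in block}
    return [[int(label[i] == label[j]) for j in range(n)] for i in range(n)]


def cycle_count(sigma):
    """Number of cycles of a permutation given as a tuple of images."""
    seen, cycles = set(), 0
    for start in range(len(sigma)):
        if start not in seen:
            cycles += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = sigma[j]
    return cycles


def alpha_permanent(matrix, alpha):
    """Brute-force alpha-permanent; only feasible for very small matrices."""
    n = len(matrix)
    total = 0.0
    for sigma in permutations(range(n)):
        if all(matrix[i][sigma[i]] for i in range(n)):
            total += alpha ** cycle_count(sigma)
    return total


def alpha_permanent_partition(blocks, alpha):
    """Product over blocks of the rising factorial alpha (alpha + 1) ... (alpha + #b - 1)."""
    return prod(prod(alpha + i for i in range(len(b))) for b in blocks)


def meet(blocks_a, blocks_b):
    """Non-empty pairwise intersections of the blocks of two partitions."""
    return [a & b for a in blocks_a for b in blocks_b if a & b]


def two_block_partitions(n):
    """All partitions of {0, ..., n-1} into exactly two non-empty blocks."""
    out = []
    for tail in product([0, 1], repeat=n - 1):
        colors = (0,) + tail              # element 0 fixes which block is listed first
        first = frozenset(i for i in range(n) if colors[i] == 0)
        second = frozenset(i for i in range(n) if colors[i] == 1)
        if second:
            out.append([first, second])
    return out


def factor(blocks_b, blocks_bp, alpha):
    """One factor 2 per_{alpha/2}(B ^ B') / (per_alpha B - 2 per_{alpha/2} B)."""
    numerator = 2 * alpha_permanent_partition(meet(blocks_b, blocks_bp), alpha / 2)
    denominator = (alpha_permanent_partition(blocks_b, alpha)
                   - 2 * alpha_permanent_partition(blocks_b, alpha / 2))
    return numerator / denominator
```

Under these assumptions, summing `factor(B, Bp, alpha)` over every two-block partition `Bp` of the same label set should give 1, which is a quick sanity check that each factor is a probability distribution over binary splits.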

Potential applications of this subfamily include inference of unknown phylogenetic trees and hidden Markov
modeling in a genetic framework. Furthermore, the CP($\alpha$, $k$) subfamily is known to be reversible with respect to
the Pitman-Ewens family of distributions with parameter $(-\alpha, k\alpha)$, yet it is not immediately clear whether this
has implications for the equilibrium measure of the associated CP($\alpha$, $k$)-ancestral branching process.
Connections between these equilibrium measures, and their relationship to Aldous's continuum random tree [2],
are of interest in this space.